Whenever we take a clicker and a small bucket of food rewards to a show it seems to generate a certain degree of scepticism and even disdain. We’ve been asked, “What happens when what’s in the bucket runs out?” and “So, I suppose you’ll have to carry that bag of treats around forever.” I’ve really never understood the resistance to positive reinforcement and I think it stems from a lack of understanding. There are several myths surrounding its use and (just like the theory of dominance and submission) those myths need to go the way of the flat earth theory. It’s 2017… We know stuff now!
Positive reinforcement is a form of operant conditioning and as such is just one of many tools in the horse trainer’s tool-kit. Like negative reinforcement, when done well it can be a useful and fundamental part of a horse’s training. When done badly it can cause the horse stress and confusion. Just as with negative reinforcement, science can tell us a great deal about the best ways to use positive reinforcement and how to optimise its use.
First of all, for positive reinforcement to be effective the reward needs to be reinforcing – that is, it has to be something the horse really wants. We’ve all seen horses getting loud slappy pats on the neck after doing a jumping round or a dressage test. Chances are this isn’t reinforcing for the horse. Horses don’t pat each other so there is nothing in their 55-million-year evolution that makes them want to seek out neck slapping or loud cries of praise. Horses are much more likely to find scratching or caressing reinforcing, particularly when the scratching is on the side of the neck, just below the wither. Studies have shown that when horses have elevated blood pressure (because of excitement or fear) caressing this spot will lower it, because this is the spot where they groom each other. Because the process of forming social bonds is so important to the horse’s survival, nature has equipped him with this convenient dopamine pump that only another horse (or an educated rider) can access. Neat, hey?
Using combined reinforcement to train a young horse to jump into the water jump confidently and calmly. | Using a clicker to habituate a camel to having her teats touched in preparation for being milked. |
Secondly, to be effective the reinforcement must be contingent on the desired behaviour. This means that you have to give the reinforcement RIGHT AFTER the behaviour you want to reward or it isn’t reinforcing for that behaviour. If your horse does a flying change in training and you want to reward him, it’s not really useful to take him back to the stable and feed him carrots because too much time has elapsed. It’s far more effective to scratch his neck immediately after the flying change. It is this aspect of positive reinforcement that has led to the popular myth that the use of food rewards in training leads to biting. I’ll explain. One of the most important things to understand about behaviour is this: behaviour that gets reinforced will be repeated – whether you like it or not. So, if you open a bag of carrots and the horse starts mugging you for food or starts nipping at your hands and you give him the carrots, you have reinforced the behaviour that immediately preceded the moment when the carrot entered the horse’s mouth. You have reinforced the mugging and nipping behaviour and therefore that behaviour will be repeated. Make sure that you do not give the food reward if the horse mugs you. Carrots can be very exciting for some horses so we use a mix of chaff and Thompson and Redwood Clayton’s Pellets. We find this very useful – it’s highly palatable (and good for them) but not as exciting as carrots.
One way to make positive reinforcement even more effective is to use a conditioned reinforcer. This is something that signals to the horse that a reward is coming. In clicker training the click and a food reward are paired together and the click tells the horse that the reward is coming. Although famous scientists like Skinner wrote a great deal about conditioned reinforcers, dolphin trainers at Sea World were amongst the first animal trainers to utilise them commercially. In the process of trying to teach a dolphin to jump in the air the trainers discovered that he was offering smaller, flatter and faster jumps in order to get back to the side of the pool faster so he could get his fish reward. They needed a way of rewarding the dolphin when he was in the air so they paired the sound of a whistle with a fish and pretty soon they were able to shape the dolphin’s behaviour by rewarding him for different aspects of the jump. When you see photos of Sea World trainers with whistles in their mouths the whistle is a conditioned reinforcer – it’s not a way of telling the dolphin what to do, it simply bridges the inevitable gap between the correct behaviour and the reward. You can use a conditioned reinforcer without food too. A word (or words) such as “good boy” when paired with scratching and caressing of the neck can be a highly effective conditioned reinforcer. You might have to scratch your horse’s neck for what seems like a long time (2 minutes) before you see a response but it’s well worth the effort.
Using a target and clicker training to teach a camel to walk calmly into the stocks for milk collection. | Using positive reinforcement to re-train an elephant that previously wouldn't pick objects up for fear of being punished. The marker in this case is the word 'di' rather than a clicker or whistle. |
A particularly useful way to use positive reinforcement is to pair it with negative reinforcement. Unsurprisingly, this is usually called combined reinforcement. During negative reinforcement, a slightly aversive pressure such as the pressure of the rider’s legs is removed as soon as the horse goes forward. Because behaviour that is reinforced will be repeated, the horse is more likely to go forwards in the future when the rider applies leg pressure. This is how most of the horse’s basic responses are both trained and maintained and is why an understanding of the principles of negative reinforcement is of the utmost importance for riders. As an example, when using combined reinforcement the rider uses their leg and when the horse goes forward immediately scratches and caresses the horse’s neck. The negative reinforcement supplies the motivation and the positive reinforcement amplifies the reward. Combined reinforcement is particularly useful during early training or when tension has crept into an already trained response.
The third and last myth about positive reinforcement is that the horse will become reliant on it and will only perform for “treats”. This is not true. We regularly use either food rewards or scratching to reinforce various aspects of a horse’s jump training and once the training is established the positive reinforcement can be faded out. We use combined reinforcement often during foundation training (breaking in) and once again, when the behaviours are reliable the positive reinforcement can be faded out.
Using positive reinforcement in an objective and coherent way can not only contribute significantly to your horse’s training, it can also be highly beneficial to the rider’s state of mind. It seems to me that if you are always looking for desirable behaviours and reinforcing them you are far more likely to leave the arena feeling satisfied with your horse than if you are only ever looking for behaviours to correct. When we introduce people to the scientific principles of positive reinforcement it is wonderful to see how over a few weeks their entire attitude to their horse can change and, consequently, how their horse’s training can improve.
Teaching a horse to jump water calmly using combined reinforcement. | Using positive reinforcement to treat an eye injury in an elephant. |