Bob Bailey is a pretty amazing guy. With his wife, Marian Breland, he owned and ran Animal Behavior Enterprises, and together they trained over 15,000 individual animals from 140 different species to do pretty much anything you can think of. I really like Bailey's work and often turn to it when I can't see the way forward or just want a bit of inspiration. He has a way of stripping off all the anthropomorphism, superstition and emotion and just getting down to the core issues. One of the most profoundly interesting things I have ever read about training, and yet one of the simplest, is something that he wrote: "You get the behaviour you reinforce, not necessarily the behaviour you want."
Bailey takes any responsibility away from the animal being trained and gives it all back to the trainer. I think this is a common thread that you see throughout the work of really good training theorists. Own your failures, not just your successes. (Andrew McLean is the master of this approach, though Andrew always gives you the feeling that he sympathizes with and understands your human frailty. Bob Bailey is perhaps more blunt, but he's not primarily a coach; he's an animal trainer.)
It's really interesting to look at Bailey's idea through an equestrian lens. The fundamental reinforcement for riding horses is the release of pressure. That's how negative reinforcement works. It doesn't have to be strong pressure; any pressure that is enough to motivate a behaviour change will do. Removing that pressure is reinforcing. Every time we apply pressure to the horse and remove it indiscriminately, we risk muddying the waters of his training because we are reinforcing whatever he happened to be doing when the pressure came off. It's a bit like a half halt... it seems to me that lots of riders just see a half halt as an opportunity to throw pretty much everything they've got at the horse without waiting for a result. The release of pressure, any pressure, must be contingent upon the correct behaviour.
Negative reinforcement is like going shopping for the right behaviour. The pressure stays on until you buy the behaviour you want by releasing the pressure. I think of it like this... if you're walking down the aisle at Coles looking for rice – you don't buy pasta because that's what you saw first, you keep going till you get to the rice. It's the same with the horse. Have a clear end-goal in mind and make your release contingent upon the correct behaviour.
Positive reinforcement is similar. Apples and carrots only reinforce the behaviour that immediately precedes them. That's why so many horses learn to mug their humans for food. We see it here quite often... the owner has a pocket full of treats, the horse nudges the pocket (maybe gently, maybe not so gently), the owner laughs and gives the horse the treat. I take a snapshot in my mind of the instant before the treat arrives at the horse's mouth – that's the behaviour that's being reinforced. It's quite often the horse lurching forwards or swinging his head towards the handler. It's not really a very useful behaviour to train. When humans become mobile carrot vending machines, the range and variety of behaviours that are reinforced is quite mind-boggling. What a master of positive reinforcement like Georgia Bruce can achieve with a clicker is awe-inspiring, but the indiscriminate and inadvertent use of ad-hoc positive reinforcement can cause a great deal of confusion for the horse.
That's my thought for the day... you get the behaviour you reinforce, not necessarily the behaviour you want. When things go wrong in training it's easy to reach for the bigger bit, the bigger gadget or the bigger spurs when really what we should be reaching for is the bigger brain.
PJ