A recent thread on Facebook sparked some discussion about using varied reinforcers with horses, and I mentioned that I had my own theories. "Theories" is probably the wrong word; I should have said "thoughts". A couple of people asked me to expand, so here it is.
I believe Dr Jesús Rosales-Ruiz did research showing that "jackpots" (for any species?) interrupted the flow of learning. I'm not sure how the research was done, nor what was defined as a jackpot. A fellow trainer whom I highly respect, Amanda Martin, says she took the question to her horses and they told her the same. So two sources seem to point to the same conclusion.
When I train dogs and help clients with their dogs, I do carefully use different values of reinforcers. Following Susan Garrett's model, I make sure we have a careful ranking of reinforcers- at least ten. This list will be different for each animal, including those dogs who prefer play (or work) to food. When I ask for responses to cues in a distracting environment, I use higher value reinforcers: more difficult for the dog to do, therefore it pays better. I stock my treat pouch with higher quality treats so that every effort while working on that skill is rewarded with a high value reinforcer. To me, that conditions that behavior more strongly with positive associations than if I used a lower value reinforcer. Later, when the behavior is better known or easier for any reason, I start using lower value reinforcers and the higher value ones get put away for a time when they are needed.
I used to use three different levels of treat with my horses: hay stretcher pellets, peppermint flavored horse treats and wrapped people peppermints. They were enjoyed by my horses in that order. I never noticed a change in the training flow but I never actually documented time or any other way of assessing whether there might be an interruption. I could easily see that the time it took to unwrap a peppermint slowed things down, and they probably did take longer to eat since the horses savored them. What I don't know is if that slowed or in any other way negatively affected the training or whether the speed was made up for by the increased value of the peppermint once the horse got the treat.
I think many years ago I kept all three treats in my pocket and would give the horse a specific treat dependent on the quality or effort they put in each offered behavior. Now I think that is the purpose of a shaping plan; once you get the behavior, you carefully shape such that you only click the efforts which are "average or better" (another technique learned from Susan Garrett)- the treat can remain the same. It's the clicks which pinpoint the quality. And it's my job to set the animal up so that he knows what "better" is. I don't want a frustrated learner who does not get clicked- I need to keep the ratio of clicked efforts to unclicked efforts very high.
When the research came out saying that jackpots interrupted the flow of learning, I faded out the high quality treats for my horses and now use hay stretcher pellets almost exclusively.
However, I do wonder, looking back, if my higher value treats with the horses were helpful. One thing I used the wrapped peppermints for was training Percy's recall and that is still an amazingly strong behavior for him. I also used to use it to end a session, and that did not work terribly well since it made him want more. Now I use a handful of hay stretcher pellets on the floor as an end of session indicator. Interestingly, emergency recalls are one place I still use high value treats with my own dogs. As I explain it to clients, I want that dog to hear my emergency recall cue and come FLYING to me IMMEDIATELY. I have several conversational recalls I use which mean anything from "hey, when you're done sniffing there, I really would like to continue walking" to "Hurry up, it's bloody cold out and I want to open the house door and go in". My emergency recall is for times such as the UPS truck is coming in the driveway at about 70 miles per hour and I want you all right here by my feet right now. Or they've disappeared into the woods after a sniff and I'm starting to worry so please return right now. For those times, my cue is a whistle and the dogs (those which can still hear) come racing back. For that, they get the best I have to offer- I usually try to keep string cheese in my pocket but if I've got steak fat trimmings or anything better, that's what they get.
I really do want that same reaction from my horses. One thing I love about clicker trained horses is the ability to recall them. I rarely use it to get them to come in for a training session. For one thing I only want one horse at a time and it seems more polite to go get that one than to have all six come running and then only work with one, even if I did reinforce each for coming. What they want is a training session. But in a pinch- a gate left open by mistake for instance, or foul weather when I really do want all of them to come in, I love to be able to recall and get an immediate response. So I think that wrapped peppermints are a legitimate reinforcer in that situation. I'm not in a training session where I need to keep the flow going. It's a one time "thank you!" which should reinforce that recall for the next time they hear it.
Grass is a very high value reinforcer for horses
The other thing we have to realize is that if we get beyond simple training to secondary reinforcers, Premack and more advanced skills, we are using different value reinforcers with our horses. Any chains we build utilizing cued behaviors as reinforcers, using grazing as a reinforcer, etc. are all examples of varying the strength, variety and amount of reinforcer. We have to decide on the appropriate reinforcer for the behavior if we want to use these tools well.
The last thing I want to mention is something fascinating which has happened with Percy. Percy isn't what I'd call a terribly food motivated individual. When he goes into his stall at night, he is more interested in what people and other horses are doing than his dinner. He may take a bite of hay and walk to the aisle window to look around while he chews it. He may stick his nose in the grain and shuffle around a bit. He always cleans up his hay eventually but he doesn't always eat all his grain. But he LOVES hay stretcher pellets. Ab-so-lutely goes bonkers for them. I have come to believe that the training process has given value to the hay stretcher pellets, as opposed to the opposite. If he hears me scooping handfuls of hay stretcher pellets into my treat pouch, he just about climbs in the barn window. So while I have always said I "just use plain old hay stretcher pellets" for training, in fact they could be a very high value treat for him, regardless of my opinion.
One of MY favorite reinforcers.
Great blog! :)
I mostly use Alam cubes, a large pellet made of 30% flax seed. The horses like them, will work for them, and they're big enough to feed just one.
However, I have found that I will use Stud Muffins for really difficult tasks. OR if they'd rather wander off because there's something they don't like about what we're doing, I pull out an SM, break it into pieces, and feed those pieces one at a time to get them back in the game.
Once they're back in the game, I can switch back to the Alam cubes.
Once again, if they're not motivated to do the task I want to work on, I've got to change something - my behavior, Rate of Reinforcement, or reinforcer.
If I want to emphasize one piece of the trailering skills process, I use a Stud Muffin.
For dogs, I have tried using what one trainer called "fine dining" rather than "fast food". A large-ish piece of chicken is gulped down in a second by a dog. But that same piece of chicken can be cut up and fed one piece at a time for about 12-15 seconds, prolonging the reinforcing process. It's a jackpot, but done a little differently. This trainer recommended using this way of jackpotting for reliable recalls. :)
Thanks for the comment Laurie.
Something I realize I should have added is that the amount of time to consume the reinforcer is critical as well. So dog biscuits, which take time to crunch up and will fill a dog up quickly, are not good choices, whereas chicken in small bites is eaten quickly and doesn't fill them up as fast. I have also read that a dog does not necessarily perceive a big treat as better than a little one, but several small ones are perceived as better, such as you describe with the chicken, though I've only used three at most. How many dogs have you tried this with and what have you found for results?
If I have an animal wander off during training, I generally let them go (that's a choice they make and I want to respect it and use that as information about my training). I'd then have a think about why they left. Using a better reinforcer at that point would feel like bribery to me and/or teaching a behavior chain (wander off, come back, get a better treat).
Love this post! Fascinating to hear about Percy's reaction to those hay stretcher pellets.
With my horse I mostly use one kind of food reinforcer. If I throw in a different one in the midst of a session I find it sometimes makes him perk up a little, as if he enjoys the variety, while at other times it makes him seem a little puzzled, or even disappointed.
So from my limited experience it looks as if you're spot on about the criteria/the shaping plan being more important than what treat you're using (given it's something the animal likes, of course).
About feeding several small treats in a row, I've tried that too and found it useful. One thing it worked well for was when my horse was at the far end of the pasture and I wanted him to walk with me at liberty to the barn. I tried a couple of different techniques, targeting and clicking/treating along the way, but what worked best was feeding him small amounts for a longish time when he came to greet me, and then just start walking; when we arrived at the barn I'd repeat the same feeding process.
Forgot to add a thought about variety and the relative value of reinforcers. My horse likes sunflower seeds, and I like using them because I can feed pretty large amounts without worrying about sugar content. However, if I use them exclusively he will go off them and won't eat them. So if I want to stick to one food reinforcer, I need to make sure it's one that's valuable enough for him to enjoy in the long run.
Thanks for your comments Lottie.
Your response to throwing in a new reinforcer seems to go along with what Jesús and Amanda found.
I love the story about your findings when walking with Aslak at liberty. Thank you for the detail. Very interesting. It makes perfect sense to me- sometimes breaking something up into little pieces also interrupts the process. It's as if they focus on the little pieces, rather than the whole, which is what you are after. I think for someone beginning with Clicker Training, targeting and frequent reinforcers would be required. But your history with Aslak and +R means that he already understood about walking at liberty, had a good relationship with you so was more than happy to accompany you and did not need the frequency, which might have had him wondering if you were clicking for something specific such as the way he was moving or touching the target. Do you think so?
Also interesting about the sunflower seeds. I suppose like me and that Ben and Jerry's ice cream. I sure do love it but after a pint, I've had enough (for that night!)
Oh, YES Jane! to what you're saying about focusing on detail rather than the whole. I hadn't thought about those pasture walks from that angle, but I'm sure you're right. I think I need to keep that thought in mind, it might help with some things I'm struggling with. Thank you!
The sunflower seeds are intriguing; they're the only treat I've tried which Aslak has first eaten with gusto and then flatly refused to touch. I wonder if it has to do with fat content? Thinking about your ice cream example... rich food can get overwhelming after a while. With other things, like the low carb pellets I mostly use, he's okay having them day after day.
Jane -- I love this quote!
"I have come to believe that the training process has given value to the hay stretcher pellets, as opposed to the opposite."
Best,
Mary
Mary, I mentioned that to Susan Friedman and she said she'd heard it before so I guess it's not unique :)
Here's the link to the jackpot research: http://www.animaltrainingsolutions.com/Effects_of_Jackpots-K_Muir.pdf