Thursday, April 7, 2016

Jackpots and High Value Reinforcers: Riding the Swing of the Pendulum

When I first started clicker training over 15 years ago, jackpots were somewhat popular. A jackpot was either a larger number of treats or a special treat. Since my standard treat was 2 hay stretcher pellets, the jackpots I used were a full handful of pellets, a couple of peppermint horse treats or a human peppermint candy. My horses really liked the peppermint horse treats and loved the candy. Sometimes I also used cut up carrots or apples, but I liked being able to keep the peppermints on hand, since the carrots and apples didn’t keep well in my pocket.

The way I understood their use was that when the horse offered extra effort, you could give a jackpot to communicate that it was a really good try. Over time, however, jackpots fell out of favor. A study done with dogs and conducted by Dr. Jesús Rosales-Ruiz and his grad students indicated that jackpots interrupted the flow of training, thereby actually slowing training rather than aiding it. I could understand how this could be, as my horses savored the peppermints, to say nothing of the time it took me to unwrap the candies. I felt bad fading them out at first, but over time I found that my horses worked just fine for the hay stretcher pellets alone. I stopped using the peppermints...well, except for one particular learner for whom they continue to be very effective, that being my husband.

Several years later, I took Susan Garrett’s Recallers course for dogs. I did the course with both a horse and dogs. Susan stresses knowing what your dogs consider to be high value reinforcers and using them carefully, but definitely using them. So back I went into experimentation mode. Susan’s course helped me understand a different way of using them. Rather than using them to point out good effort in shaping, she used higher value treats as a balance to distractions. The world is a fascinating place to dogs, and in order to gain and keep the dog’s attention, we used cheese, hot dogs, tuna fudge and other delectable items to reinforce the dog for doing what we asked, as opposed to sniffing in the grass or leaving to say hi to the neighbor.

[Photo caption: Speedy recalls: a result of high value reinforcers and slow progression]

This is when I developed my own foundational concept that while for dogs, distractions are usually things they want to get TO; for horses, distractions are usually things they want to get AWAY FROM. Yes, this is a generalization; dogs have fears and horses have interests. Professional dog trainers certainly see our share of reactive dogs at one end of the spectrum or the other. Ponies are probably the most notorious for being distracted by wanting to get TO something (GRASS!). But as a whole, horses’ problems with distractibility include the wind, the leaf blowing, the scary plastic bag: all things which may result in spooking and leaving the premises. This was the reason, I thought, that high value treats wouldn’t work for horses’ distractions. If I want to work with my dog’s distraction, all I have to do is become more interesting than the distraction. But for a horse, it doesn’t matter how good a treat I have. If that horse is afraid, he’s not going to be dissuaded by something tasty. I love chocolate and you can certainly use it to train me, and you can use it to reinforce me for staying away from the computer. But if I think I’m about to be chased by a bear, you aren’t going to get me to stand around just because you’ve got chocolate. (Note: so instead of using high value reinforcers, one needs to be more skilled at breaking things down, reading the horse's emotional responses, and becoming more trustworthy than the "distraction" is scary.)

So I continued to use high value reinforcers for dogs, but not for horses. 

This brings me to the present when I am starting to hear criticism of using high value treats for distracted dogs and I am considering using high value treats with horses again. 

The criticism I hear in the dog world, and so far it’s been secondhand, sounds like what I would call inappropriate use, because people are using them in situations that are more horse-like! If you have a reactive dog, and a strong stimulus is present, pulling out liverwurst and waving it in front of your dog’s nose to keep his attention is not addressing the problem of reactivity. Maybe your dog responds and maybe he doesn’t, but in my view, the issue isn’t being properly addressed. It’s as if you were using peppermints and apples to get a horse past something scary. It might work, but unless it was a very minor stimulus, you likely haven’t changed the way the animal feels and you may need to do it each time to get that horse past the scary thing.

The way I use high value treats with dogs is to begin training where the dog can be successful, and I use the lowest value treat available which the dog will respond to. This varies with the individual. Some dogs, represented as “not food motivated”, can only be enticed with the finest of cheeses at first. But as Ken Ramirez says, “if your dog isn’t food motivated, then he’s dead!” In time, by building a relationship and understanding, those fussy individuals can be trained with plain kibble.

But once behaviors such as attention, loose leash walking and sits, for example, have been taught in a quiet home environment, we need to slowly introduce distractions. That’s when I pull out a higher value reinforcer. Following Susan Garrett’s guidelines, I rank the distractions for the dog in addition to ranking food (and activity!) reinforcers. Then I balance them. For one dog, a tennis ball on the floor might cause mayhem, so I wouldn’t use that as an initial distraction. For other dogs that ball might be interesting but not highly distracting. If they showed interest in it by looking at it, and then looking at me, I’d reinforce with some cheese. In my mind, the dog says, “ooh, she’s much more interesting than that silly ball”, and we continue to work together. Then I transition back to kibble, even in the presence of the ball. And I find a slightly more distracting item, such as someone slowly rolling the ball back and forth as we walk by. Movement is always more interesting, so the dog might look at the rolling ball, but then remember how interesting I am, look at me and stay close on the leash. Out comes the cheese again. Through this slow and steady progress up to more and more distracting things, I teach the dog that I am ALWAYS more interesting than whatever else he might see. Through this process, the dog actually becomes desensitized to distractions in his environment. Personally, I'm not a fan of a dog who stares at me obsessively, but checking in with me and being able to respond to simple requests in the presence of distractions is a goal. Only when the dog is this responsive, even working with kibble, is he ready for me to raise the bar again, at which point I pull out something special to maintain my special status.

So why would I ever need anything even higher value? Well, life and the environment can only be controlled to a certain point. Once we step outside of the home environment, we are subject to unplanned distractions. I want to have a really, really good base before I risk outings with potential for the unexpected. When that Rottweiler comes around the corner unexpectedly and barks, or when the UPS man suddenly comes screaming into the driveway, or when that rabbit streaks across the road in front of us, those are times I want something incredibly valuable on me so I can reinforce good responses. I’m not using it to prevent my dog from looking. I’m using it because our incredible history has allowed the dog to respond to me even in the face of a barking Rottweiler, a UPS truck or a bunny, and I want to have something that demonstrates how amazing that response was!

Using high value reinforcers in this way is not interrupting a shaping session and it’s not using the food to distract the dog from something. The way I see it, it’s simply reinforcing behavior I want to see repeated and using a reinforcer worthy of the effort of the behavior. 

So, back to horses: why would I start using high value reinforcers again with horses? Part of my thinking comes from my experiences at NEI, and part from one memorable experience with an equine and a high value reinforcer.

When we were at NEI, our instructors stressed the importance of being aware of where the animal was when we reinforced. Now both dog and horse trainers know how to feed for position and to set up for the next rep. But the birds were excellent teachers of more careful thought about this...of being aware of where the reinforcement was delivered. If you reinforced a bird on a small board “station” on a perch for just one session, that bird would go directly to that station the next time he was out. If that’s what you want from that bird, that’s great, but if you want flexibility in the behaviors, it’s not so great to have him glued to that station.

An example of this was one of my training sessions with the corvid. I was to train him to fly from one perch at the front of the cage (I’ll call this perch A), to another perch at the back of the cage (perch B). Through the use of prompting, it was pretty easy to get the bird to fly to perch B and put it on cue. But one time I marked his landing on B, and at bird speed, he hopped immediately back to perch A and then to a third perch which was right in front of me...and I fed him there. Big mistake. Even though I had marked the landing on perch B, the next several times I prompted him to perch B, he instead flew to the perch where I had fed him...once.

Now I do think there were some other factors, namely that the trainers at NEI rarely used verbal markers. Because the birds are almost always looking at them when they train (as opposed to horses and dogs, when we want them going away from us or not looking at us at times), the NEI trainers simply feed when the bird does the correct behavior. So even though we conditioned the word “good!” in our early training with the birds, I don’t think it had the power that a click has on my dogs and horses. If my animals hear a click, you can be sure they will repeat that behavior, regardless of where I fed for it. That’s why I can click for a dog sitting and then toss the treat to reset. The dog will return to me to sit (which was where I clicked), not go sit where the food was tossed.

But in other situations, it does make a difference. Feeding where the perfect horse would be is an example. If we click when the horse is in the correct position next to us, but then feed close to our bodies, that horse’s head is going to be more likely to be too close next time. Yes, it may be what happens between the click and the feeding, but that’s what happened with the bird as well and it affected his future behavior, not just his treat taking behavior. 

In any case, this has made me more aware of what places I may be building up a history of reinforcement for. The story it made me think of was when I first taught Rumer to load into a trailer. At the time, I was still using peppermints as jackpots. In this case, she had never seen the trailer before but had a good history of being reinforced for approaching and touching items which might at first have seemed scary. 

[Photo caption: Rumer's comfort with the trailer is maintained by careful driving and not trucking her before she was ready]

I used hay stretcher pellets for reinforcing each step up the ramp, allowing her to back off when she wanted. But the next time in, she didn’t get treats at the bottom of the ramp, only when she had progressed beyond the previous attempts. I was building a history of more reinforcement inside. When she got all the way into the trailer, I pulled out a peppermint candy and gave it to her. There was no interruption of shaping; she was where I wanted her to be. Then I asked her to back off (there were many steps through this process which I’m glossing over) and we walked around in a little circle and approached the ramp again. Well, that little pony just about dragged me onto the trailer. It was as if she was saying “The good treats are inside!” All the way in she went, and she did, indeed, get a peppermint candy once inside.

So in that case, it was pretty obvious that place equaled high value reinforcer, and that in time, the value of the reinforcer could transfer so that it was a high value place. Now obviously, just getting in the trailer is only the very first step. She still needed to learn about the butt bar, the ramp, the trailer moving, longer trips, etc. But her enthusiasm for going all the way inside has stuck with me and seems to be a good reason to consider what treats we are giving and where we are giving them.