Positive or Negative Reinforcement

Terms of the Four Quadrants Identified

Written By Chris Biro, Copyright 5 January 2010

                    Increase Behavior         Decrease Behavior
Add Something       Positive Reinforcement    Positive Punishment
Remove Something    Negative Reinforcement    Negative Punishment

A Freeflight list member wrote: "Since you mentioned the term 'negative punishment', I wonder, is there anything like 'positive punishment'? I thought all punishment was negative by its very nature."

There is both positive and negative reinforcement, and both positive and negative punishment. Remember that these terms are scientifically defined, not just defined by common use. In Operant Conditioning terminology some words have very specific scientific meanings that differ from common public use, and they can take some getting accustomed to before you can use them correctly.

Operant Conditioning is a leading field in the scientific study of learning and behavior. The main concept is that all living creatures repeat behaviors that have rewarding consequences for them and avoid behaviors that make bad things happen to them. In common usage we think of consequences as coming in two forms, rewarding and aversive, but in the scientific definition consequences are classified by two things: whether they increase or decrease the target behavior, and whether the active element of the consequence is added or removed.

Reinforcement means the consequence causes the target behavior to increase, be maintained, or be more likely to happen again in the future.
Punishment means the consequence causes the target behavior to reduce or be less likely to happen again in the future.
Positive (+) is to add something.
Negative (-) is to remove something.

Note: the "something" is a consequence immediately following the behavior that results in increased or decreased future behavior.

So:

Positive Reinforcement (R+) is to add something, with the result that the behavior increases. Generally this means the bird's behavior causes desired things to be added, so the behavior increases.

Negative Reinforcement (R-) is to remove something, with the result that the behavior increases. Generally this means the bird's behavior causes undesired things to be removed, so the behavior increases.

Positive Punishment (P+) is to add something, with the result that the behavior decreases. Generally this means the bird's behavior causes undesired things to be added, so the behavior decreases.

Negative Punishment (P-) is to remove something, with the result that the behavior decreases. Generally this means the bird's behavior causes desired things to be removed, so the behavior decreases.

Note: In a strict technical sense the punisher or reinforcer is not qualified as good or bad, desired or undesired. It is defined purely by its effect on future behavior. Generally, though, reinforcers are things the bird desires and punishers are things it avoids. Defining the target behavior is critical in determining which quadrant is involved.

It is currently popular to limit training to Positive Reinforcement (R+), but it should be remembered that all four quadrants exist in real-life experience. Each quadrant does in fact have actual training value, even if only in very limited and specific circumstances. And though we focus most of our training efforts in the positive reinforcement quadrant, doing so can at the same time be viewed as using negative punishment when the treat is withheld for poor performance. Things can start getting a little confusing if you think of withholding the treat as diminishing the specific poor behavior (negative punishment) while at the same time withholding the treat increases the specific good behavior (negative reinforcement). Often a second quadrant is in play at the same time.

For more on how to actually use positive reinforcement in training, read the articles Click & Treat and Clicker Training and Target Training.
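
For readers who find a lookup easier to follow than a table, the four quadrants can be sketched as a small Python function. This is only an illustration of the definitions above, not part of any training method. The names Quadrant and classify are made up for this example. The function asks the same two questions as the table: was something added or removed, and did the target behavior increase or decrease?

```python
from enum import Enum


class Quadrant(Enum):
    """The four quadrants, using the abbreviations from the article."""
    POSITIVE_REINFORCEMENT = "R+"
    NEGATIVE_REINFORCEMENT = "R-"
    POSITIVE_PUNISHMENT = "P+"
    NEGATIVE_PUNISHMENT = "P-"


def classify(something_added: bool, behavior_increases: bool) -> Quadrant:
    """Name the quadrant from the two questions in the table.

    something_added: True if the consequence adds something,
        False if it removes something.
    behavior_increases: True if the target behavior increases (or is
        maintained), False if it decreases.
    """
    if behavior_increases:
        if something_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    if something_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# Example from the text: the treat is withheld (removed) and the
# poor behavior decreases.
print(classify(something_added=False, behavior_increases=False).value)
```

Notice that the function never asks whether the bird likes or dislikes the consequence. That matches the strict definition: the quadrant is decided only by what was added or removed and by what happened to the behavior afterward.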

Parrots: more than pets, friends for life.

Chris Biro

Variable or Differential Reinforcement?

Should A Variable Schedule of Reinforcement Be Used?

Written by Chris Biro, Copyright 2008

The Question: "As advised by a friend of mine who trains dolphins and killer whales, I'm now using variable reinforcement so she does not get a seed every single time she comes to me. I'm hoping this is the right thing to do. Can anyone verify that?"

Actually I think it all depends on the behavior you are working on building. During the training phase I ALWAYS give a reinforcer for EVERY correct performance. But what is a correct performance? That depends entirely on the element of the behavior you are currently working on. Flying to you is a very broad definition. Do you mean flying to you when called? Immediately flying toward you when called? And just how long a delay is too long before the bird flies to you when called? If you are in fact working on a quick response to recall, then where you are in that process will determine what the "correct" behavior is.

Every correct behavior should be reinforced with a treat. The only performances that should occasionally not get reinforced are those that are only 'close' to your criterion without actually reaching it. And the only behaviors that should never be reinforced are those that are way below your standard - even this is not absolute, because sometimes we need to back up a few steps or relax a previous criterion to get another one started. All of this should mean that the bird is being reinforced for about 8 out of 10 (4 of 5) responses.

From an observer's view, this could look like a variable schedule, but really it is basic differential reinforcement - the selective reinforcement of one aspect of a behavior pattern to the exclusion of other aspects. The reinforcement schedule is not actually being varied - animal trainers rarely keep records precise enough to actually use a variable schedule.
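
To make the three cases above concrete, here is a minimal Python sketch of that decision for a recall exercise. It is an illustration only. The function name should_reinforce, the idea of measuring the response in seconds, the "close" margin, and the 50% chance of treating a close response are all assumptions made for this example, not figures from actual training.

```python
import random


def should_reinforce(latency_s, criterion_s, near_margin_s,
                     near_miss_rate=0.5, rng=random):
    """Decide whether one recall response earns a treat.

    latency_s: seconds between the call and the bird flying (assumed measure).
    criterion_s: the current standard; at or under this is a correct response.
    near_margin_s: how far past the standard still counts as 'close'.
    near_miss_rate: assumed chance of treating a 'close' response.
    rng: anything with a random() method, so results can be made repeatable.
    """
    if latency_s <= criterion_s:
        return True  # correct performance: always reinforced
    if latency_s <= criterion_s + near_margin_s:
        return rng.random() < near_miss_rate  # close: only sometimes
    return False  # way below the standard: not reinforced


# Hypothetical session: response times in seconds against a 2-second standard.
session = [1.2, 1.8, 2.6, 1.5, 4.0]
rng = random.Random(0)
treats = [should_reinforce(t, criterion_s=2.0, near_margin_s=1.0, rng=rng)
          for t in session]
print(sum(treats), "of", len(session), "responses reinforced")
```

Backing up a step, as mentioned above, would simply mean calling the function with a larger criterion_s for a while. The point of the sketch is that the treat depends on the quality of each response, not on a schedule that decides in advance which responses get paid.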
If the animal trainer is in fact not reinforcing every correct performance, it is usually more technically correct to call this an "intermittent schedule of reinforcement" rather than a variable schedule or variable ratio. A variable schedule involves a precisely determined schedule that is varied by some set formula. I have yet to meet an animal trainer who does anything even remotely close to this. Likewise, people often incorrectly refer to a "variable ratio", which means the reinforcement is handed out on a precisely varied ratio. Intermittent reinforcement simply means the reinforcements are not on a set schedule or ratio, and it usually best fits what a few animal trainers use when not using click = treat (a 1:1 ratio of reinforcement).

So to answer your question directly, I believe that when training a behavior it is important to offer a seed every time the bird does the behavior correctly. Once the behavior is fully trained, you can start relaxing the rate of reinforcement to a point that will continue to produce adequate response levels.

Chris Biro