
5. Reinforcement & Punishment

The front image of this chapter shows the conceptual framework that has been developed for pulling together the basic findings of research on how behaviour changes as a consequence of interactions with the environment. These findings are derived from studies that have examined how the frequency or future likelihood (i.e., the probability) of a behaviour is affected by consequences. It turns out that there are two basic effects on the probability of behaviour. Apart from remaining unchanged, the probability can either increase (i.e., Reinforcement effect) or it can decrease (i.e., Punishment effect). It really is as simple as that, at least at first glance.

There are, however, quite a number of other considerations to be noted, some of which are technical in nature and some of which concern the ways in which basic findings can be used in applied settings (e.g., Education). In this chapter we will look briefly at some of these issues and we will provide links to where they are discussed in more detail.

Our starting point is to note that, under the headings of Reinforcement and Punishment in the opening image, there are two other terms, ‘positive’ and ‘negative’. These terms refer to aspects of any procedure that results in either a reinforcement or punishment effect. The term ‘positive’ in this context refers to the fact that something was added to the environment as a consequence of the behaviour. The term ‘negative’ refers to the removal of something from the environment as a consequence of the behaviour. Students often get confused at this point so let’s see if we can prevent any confusion by getting you to remember two points before you analyse the procedure responsible for a behaviour. When you see behaviour occurring with a high probability, or at a high rate, then, whatever the procedure, you may be witnessing an example of a reinforcement effect; ideally, though, you would need to see the behaviour prior to this observation to judge whether there has been an increase in the probability of behaviour. The opposite is true if you observe a very low rate of responding; again, though, the low rate would have to be relative to a previously higher rate before we can call it an example of a punishment effect. So, think first about how the probability of behaviour has been affected to help you understand whether you are observing a reinforcement or punishment effect. The next step is to see if something was added (positive) or taken away (negative) as a consequence of the behaviour.
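The two-step analysis described above can be written out as a small lookup. The following Python sketch is only an illustration and is not part of the original tutorial; the function name `classify_procedure` and its string arguments are hypothetical labels chosen for this example.

```python
def classify_procedure(probability_change: str, stimulus_change: str) -> str:
    """Name a procedure from the two questions asked in the text.

    probability_change: "increase" or "decrease" (step 1: the effect on behaviour)
    stimulus_change: "added" or "removed" (step 2: what the behaviour produced)
    Both argument names and their allowed values are illustrative assumptions.
    """
    effects = {"increase": "reinforcement", "decrease": "punishment"}
    procedures = {"added": "positive", "removed": "negative"}
    if probability_change not in effects:
        raise ValueError("probability_change must be 'increase' or 'decrease'")
    if stimulus_change not in procedures:
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    return f"{procedures[stimulus_change]} {effects[probability_change]}"


# Step 1 comes first: decide the effect, then look at the procedure.
print(classify_procedure("increase", "added"))    # positive reinforcement
print(classify_procedure("increase", "removed"))  # negative reinforcement
```

Note that the sketch assumes a change in probability has already been established; as the text stresses, that judgement requires comparing the behaviour with an earlier observation.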

If it is observed that certain environmental events occur only after a behavioural event has occurred, then the term 'contingency' is used to refer to the relation between these events. Putting it formally, a contingency specifies an IF-THEN relation between two events. The existence of a contingency between behaviour and an environmental event crucially determines the future probability of behaving. That is to say, an environmental event itself does not function inevitably as a reinforcer.

This can be seen easily from an experiment involving an animal that has been trained to press a lever for a food reinforcer. In this instance, the experimenter has arranged a contingency between lever pressing and food delivery so that the absence of lever pressing results in the absence of food delivery.

A simple way to examine this contingency is to break it and see what happens. We can break it in either of two ways:
(a) Lever pressing no longer produces food. If this were to happen, the animal would stop lever pressing. Nothing too surprising there!

(b) Food is delivered non-contingently, i.e., it is delivered freely so that lever pressing is unnecessary to produce it. If this were to happen, the animal would stop lever pressing.
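One hedged way to make the IF-THEN relation concrete is to compare how often food follows a lever press with how often it arrives without one. The Python sketch below uses a simple difference of two proportions; this measure, the function name `contingency_strength`, and the invented observation lists are assumptions made for illustration and are not taken from the experiments described here.

```python
def contingency_strength(observations):
    """Return P(food | press) - P(food | no press).

    observations: list of (pressed, food) boolean pairs, one per time interval.
    A value near 1 means food depends on pressing; near 0 means it does not.
    """
    with_press = [food for pressed, food in observations if pressed]
    without_press = [food for pressed, food in observations if not pressed]
    if not with_press or not without_press:
        raise ValueError("need intervals both with and without a press")
    p_food_given_press = sum(with_press) / len(with_press)
    p_food_given_no_press = sum(without_press) / len(without_press)
    return p_food_given_press - p_food_given_no_press


# Hypothetical records: four intervals with a press, four without.
intact = [(True, True)] * 4 + [(False, False)] * 4     # IF press THEN food
no_food = [(True, False)] * 4 + [(False, False)] * 4   # break (a): pressing never produces food
free_food = [(True, True)] * 4 + [(False, True)] * 4   # break (b): food arrives regardless

for name, data in [("intact", intact), ("break (a)", no_food), ("break (b)", free_food)]:
    print(name, contingency_strength(data))
```

With these made-up records the intact arrangement gives 1.0 and both broken arrangements give 0.0, which mirrors the point that either break removes the dependency between pressing and food.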

A human example of such non-contingent delivery of environmental events can be seen in many residential homes for older people. It is quite common to find older people who are physically able but who are ‘robbed’ of their independence when everything is done for them in the care home; their self-esteem disappears and they get depressed. Studies have shown that when they are given back control, even over some basic contingencies such as preparing their own food, they can become revitalised and feel much better.

In any science, scientists have had to invent many new terms to help them deal with the complexities that arise in the study of a particular issue. In Behaviour Analysis, the term 'operant' is one such term. It was coined to help deal with a common observation that more than one behaviour will produce the same environmental event. For example, there are many ways to write a sentence, ask a question, begin a relationship, greet someone, or study for an examination.

B. F. Skinner dealt with this complexity by noting that on many occasions discrete behaviours constitute a 'class' of behaviours with a common function. That is, as a class of behaviours they ‘operate’ on the environment to produce the same effect. The word 'operant' is derived from the word 'operate' and it is used as a label to refer to a class of behaviours with a common function. In fact, when behaviour analysts talk about how certain consequences affect the future probability of behaviour, they implicitly refer to operant behaviour.

Reinforcement + (Positive)

Stimuli or events are viewed as positive reinforcers only when:
(a) there is a contingency between the occurrence of behaviour and the presentation of a stimulus;
(b) and the consequence of presenting a stimulus is an increase in the probability of behaviour occurring again (Movie 5.1).

“Although similar, the terms ‘reinforcer’ and ‘reward’ are not identical. A reward is a form of recompense – usually an item or maybe an activity, often selected arbitrarily on the assumption that it will get more of a wanted behavior. A reinforcer must have demonstrated its effectiveness as a stimulus for increasing or sustaining a person’s behavior in the given context. Frequently rewards do function as reinforcers; occasionally … they do not. In applying behavior analysis, you need to be sure that any rewards you select indeed are reinforcers.” (Sulzer-Azaroff & Mayer, 1991, p. 140)

“Praise from peers and supervisors, feedback in the form of data showing client improvement, or other information about teaching or therapeutic effectiveness, and even feedback derived from self-recordings have been shown to be especially effective reinforcers… . Generally, though, compounding feedback with praise or other reinforcers is more effective than feedback alone.” (Sulzer-Azaroff & Mayer, 1991, pp. 164-165)

Unconditioned reinforcers are prescribed by genetic inheritance. For example, sexual satisfaction, food, drinks, and warmth are all strong reinforcers in the right context. However, the extent to which they are effective depends also on whether or not an organism has been deprived of them. An organism that has just been fed is less likely to respond to food as a reinforcer compared to an organism that is hungry.

Conditioned reinforcers have played an important part in the theory of behaviour analysis and its applications. Conditioned reinforcers develop as a result of a person’s learning history. That being the case, the reinforcing strength of a particular conditioned reinforcer varies from individual to individual.
“Newborn Tina’s parents feed, hold, and cuddle her, using certain gestures and vocal tones. When pleased with her actions they praise her: “Good baby.” Over a long time, the gestures, tones, words, and other events that have been frequently paired with food and comfort begin to signal that food or comfort is apt to be forthcoming, and eventually the signal itself becomes reinforcing.” (Sulzer-Azaroff & Mayer, 1991, pp. 144-145)

Generalised Reinforcers
These are related to conditioned reinforcers. They are effective for a wide range of behaviours because they have been paired with a wide range of reinforcers, unconditioned and conditioned. Tokens, including money, vouchers, certificates, grades, are a good example. An advantage of using generalised reinforcers is that they can be delivered almost immediately after the desired behaviour has occurred.

Intrinsic Reinforcement
This is a term used to refer to the reinforcing effect produced by behaviours that have 'natural' consequences. For example, the sound of a well played musical instrument is intrinsically reinforcing for playing the instrument.
“... intrinsic reinforcers, also called automatic reinforcers ... are consequences that result automatically from engaging in the behavior. Solving a crossword puzzle, viewing attractive paintings, listening to beautiful music, and reading a good novel are all possible intrinsic reinforcers because the consequences of these behaviors occur automatically.” (Grant & Evans, 1994, p. 37)

Extrinsic Reinforcers
This is a term used to describe “consequences external to ourselves, such as money, applause, and verbal praise from others. These extrinsic consequences do not automatically occur as result of engaging in a behavior.” (Grant & Evans, 1994, p. 37)

Premack principle
“Is it water or the act of drinking that is a reinforcer for a thirsty animal? Is a toy a reinforcer for a child or is it the behavior of playing with the toy? Premack proposed that it is more accurate to characterize the reinforcement procedure as a contingency between one behavior and another than as a contingency between a behavior and a stimulus.
Premack's principle ... provides a straightforward method for determining whether one behavior will act as a reinforcer for another. ...
Premack suggested that instead of postulating two categories of behaviors - reinforceable behaviors and reinforcing behaviors - we should rank behaviors on a scale of probability that ranges from behaviors of high probability to those of zero probability. Behaviors higher on the probability scale will reinforce behaviors lower on the probability scale.” (Mazur, 1994, p. 193)
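The ranking idea in the quotation above lends itself to a short sketch. The Python below is a minimal illustration only: the function name `can_reinforce` and the baseline probabilities are invented for this example and do not come from Premack's or Mazur's data.

```python
def can_reinforce(candidate_probability: float, target_probability: float) -> bool:
    """Premack's rule as quoted: a higher-probability behaviour reinforces a lower one."""
    return candidate_probability > target_probability


# Hypothetical baseline probabilities for one child (invented for illustration).
baseline = {"playing with toy": 0.6, "tidying room": 0.1, "reading": 0.3}

# Rank behaviours from most to least probable.
ranking = sorted(baseline, key=baseline.get, reverse=True)
print(ranking)

# For each behaviour, list the behaviours that sit above it on the scale.
for target in ranking:
    reinforcers = [b for b in ranking if can_reinforce(baseline[b], baseline[target])]
    print(f"{target!r} could be reinforced by: {reinforcers}")
```

In this sketch the same behaviour (reading) appears both as something that can be reinforced and as a potential reinforcer, which is the point of replacing two fixed categories with a single probability scale.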

“Reinforcers and punishers are relative. ... The ability to reinforce or punish behavior is not an intrinsic quality of a stimulus. Stimuli that can function as reinforcers or punishers are not transituational, that is, they need not have the same effects in all situations. A given stimulus can function as a reinforcer, neutral stimulus, or punisher in different situations. During childhood a person may consistently avoid eating brussels sprouts; yet in adulthood, the person may love eating them. Thus, brussels sprouts functioned as a punisher in childhood and a reinforcer in adulthood. Some women enjoy sexual relations for years; but after a traumatic rape any behavior that might lead to a sexual interaction becomes suppressed. For these women, sex was a reinforcer until they were raped, then it became a punisher. Grade school children often hold on to their parents while being hugged and kissed at home with the family; yet they may pull away if a parent hugs or kisses them in front of their friends. That is, the children respond to hugs as reinforcers at home by maintaining contact, yet respond to hugs in public as punishers by avoiding them.” (Baldwin & Baldwin, 1986, p. 93)

Delay of Reinforcement
The basic finding is that the longer the delay the less likely you are to observe a reinforcement effect. In other words, immediate reinforcers are more effective than delayed reinforcers.
“For example, Ramey and Ourth [1971] reinforced infant vocalizations by lightly touching [an] infant's stomach, smiling, and saying, "That's a good baby." They did this either immediately following infant vocalizations or 3 or 6 seconds after an infant vocalization. [They found that] only the immediate reinforcers were effective in strengthening vocalizations.” (Grant & Evans, 1994, p. 49)
Of course, in real life there are many occasions when long delays of reinforcement are able to produce a reinforcement effect. Take, for example, the amount of work you will be doing on your degree program before you get the big pay-off! The long-term consequence of getting your degree and a job is supplemented with smaller consequences along the way to help sustain your studying behaviour.
“Another line of research has suggested that it is desirable to teach people to learn to avoid immediate reinforcers in favor of larger delayed reinforcers (Mischel, 1984, 1986). Mischel (1984) found that preschool children who had the ability to delay their reinforcers had, when they reached adolescence, more social competence, better thinking and coping skills, and better school achievement than their counterparts who were unable to delay their reinforcers. Similarly, the notion of self-control has been identified with the ability to choose a larger delayed reward rather than a smaller immediate reward, while the concept of impulsiveness has been identified with choosing a smaller immediate reward rather than a larger delayed reward...” (Grant & Evans, 1994, pp. 50-51)

What is the relation between a reinforcement procedure and bribery? This is an important question, which if left unanswered can result in demonstrably effective procedures not being used to help people change their behaviour for fear of bribing them. “A bribe is something of value given to a person so as to bring about a change in the behavior of that person in contravention of the otherwise existing rules and laws to which that behavior is subject.” (Harzem, 1996, personal communication)

A common mistake is to say that stimuli function as reinforcers because they make one feel good. Feelings arise because of the consequences that were arranged for behaviour. The explanation for the feelings and the accompanying reinforcement effect is found in the environmental contingencies. Details of physiological activity associated with the feelings tell more of the story of how a person has been changed by these contingencies.

Reinforcement - (Negative)

Remember, now, we are dealing with Reinforcement. So don’t go thinking that the procedures described under this current heading result in a decrease in the probability of responding (Fig. 5.1).

No, the term ‘negative’ here refers to an aspect of the procedure that results in a Reinforcement effect. You are right if you think it is a little confusing and there has been discussion on the need for a change in the distinction between positive and negative reinforcement.
Picture a scene where an animal is in a Skinner box and an irritating noise is being played through a small speaker located in the roof. If you turned off the noise for a brief period after the animal pressed a lever, what might you see subsequently? More than likely you would see the animal pressing the lever again when the noise was turned on. If each lever press produced a brief respite from the noise, then after a while the animal's rate of responding on the lever would increase. Such an increase in the rate of lever pressing would be an example of negative reinforcement for lever pressing because an event (the noise) is taken away (-) contingent upon the lever pressing behaviour.

In the example above, the initial lever press would have resulted in 'escape' from an environmental event; i.e., something would have been removed from the environment. The observation that lever pressing subsequently increased in frequency might lead you to conclude that escape behaviour is a prerequisite for negative reinforcement to occur.

Avoidance (see Movie 5.2)
How could you test the idea above?
1. Well, you might vary the time between noise presentations to see what happens; you could make the noise come on again after either a fixed or random period of time since the last noise terminated. This procedure, however, would not affect the escape contingency. That is, a lever press would still be effective only when the noise is on!
2. Another thing you could do is to arrange for lever pressing to prevent the noise coming on again once it was switched off. For example, you might arrange a contingency whereby, once the noise was switched off, each lever press resulted in a delay of x seconds before it came on again. If you did this, and if the animal pressed the lever before the x seconds was up, the noise would be 'avoided'. Continued lever pressing within the crucial x-seconds time period would result in continued avoidance of the noise. In effect, the noise would never be presented again if there was a suitable rate of lever pressing!
It is easy to imagine that these new contingencies would maintain a relatively high rate of lever pressing. And you would be right, for this is precisely the finding that researchers have obtained. Under these circumstances you would observe lever pressing being negatively reinforced because an avoidance contingency was in operation.
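The avoidance contingency described in point 2 can be sketched as a simple timing rule. The Python below is a minimal illustration under stated assumptions: the noise is taken to be switched off at time 0, the function name `first_noise_onset`, the session length, and the press times are all hypothetical, and the sketch models only the scheduling rule, not the animal's learning.

```python
def first_noise_onset(press_times, x, session_length):
    """Return when the noise first comes back on, or None if it is avoided.

    The noise is switched off at time 0 and is scheduled to return x seconds
    later; each lever press before that moment postpones it to x seconds after
    the press. All times are in seconds.
    """
    scheduled_onset = x
    for t in sorted(press_times):
        if t >= scheduled_onset:
            break  # the press came too late; the noise is already on
        scheduled_onset = t + x
    return scheduled_onset if scheduled_onset <= session_length else None


# Hypothetical sessions with x = 5 seconds over 20 seconds.
steady = [3, 7, 11, 15, 19]   # every gap between presses is shorter than 5 seconds
lapsed = [3, 7, 14]           # a 7-second gap after the second press

print(first_noise_onset(steady, x=5, session_length=20))  # None
print(first_noise_onset(lapsed, x=5, session_length=20))  # 12
```

In the steady session the noise never returns, while in the lapsed session it comes back 5 seconds after the press at 7 seconds, which is why a sufficiently high rate of pressing is needed to keep avoiding it.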
In summary, the two main classes of behaviour associated with negative reinforcement are escape and avoidance.
Baldwin and Baldwin (1981) note that
“children learn to escape wet underpants by removing them. Later they learn to avoid having wet underpants by avoiding urinary accidents. First they learn to escape the class bully by running when they see him. Later, they learn to avoid places where the bully is. As people grow up and gain experience, they usually learn increasing skills in … avoiding problems, and avoidance becomes more common than escape.” (p. 19)
Other examples include: turning down the volume of the car radio to escape from the loud noise; turning the volume control of the radio off before you start the car to avoid loud noise; locking your car to avoid it being stolen; going inside to put on warm clothes when it gets cold outside; putting on warm clothes on a cold day to avoid feeling cold; conforming to societal norms to avoid being rejected; increasing communication skills in relationships to avoid problems; paying taxes to avoid imprisonment; studying for exams to avoid the consequences for your career of failing the exams; giving in to a child who is demanding a bar of chocolate. In this last example, the behaviour of buying the chocolate is likely to have been negatively reinforced because it results in the child no longer crying. On the other hand, the child's demanding behaviour is likely to have been positively reinforced. Evidence for both positive and negative reinforcement is likely to appear when similar circumstances arise at another time.
On many occasions where negative reinforcement is responsible for an operant response, the person concerned views the situation as aversive and therefore unpleasant, often resulting in counter-control. For this reason, applied behaviour analysts prefer to use positive reinforcement whenever possible.
Sometimes when you observe an organism you might not be able to discern any obvious consequences of their behaviour. In such cases you may be observing well trained avoidance behaviour. For example, if you peered into a Skinner box when a rat was responding on an avoidance schedule, the only thing you would see would be the rat pressing a lever, nothing else.
“A human example might be as follows. We watch a person walking down the street and notice that every time s/he reaches a certain point s/he crosses the road for no obvious reason. While we might hypothesize that there is something positively reinforcing about the other side of the road, it is just as plausible that s/he is avoiding something on the first side but we might not be watching that side of the road. Indeed, if we found nothing positive on the second side we have to consider such avoidance. But the point is that we might find nothing obvious since a few trials with the avoidance can establish a very strong response.” (Guerin, 1994, p. 48)
A similar example is familiar to many teachers when they ask questions during class. It is often the case, especially when the class contains large numbers of students, that questions are met with silence from the whole class. Despite the wide range of personalities in the class, something about the situation ensures that they all behave in the same way. This response has more than likely been reinforced in the past by avoidance conditioning. That is, students avoid the social consequences of getting the wrong answer to a question. In a learning environment, it is to be expected that incorrect answers are given in response to questions; that is why it is called a learning environment. However, rather than being positively reinforced for attempting answers, students often remain silent and it is this behaviour which is negatively reinforced.
It is possible to demonstrate this argument to a class that generally remains silent when questions are asked:
1. Instruction: Tell the class that you will write something on the board that they all must say out loud. Anyone who does not say it will be asked to come to the front of the class.
2. Write the word "Eye" on the board and ask them to say it.
3. Beside this, write the word "kan" and ask them to say both words.
4. Beside these two words, write the word "talk" on the board and ask them to say all three words.
If you follow this procedure your class will be transformed momentarily from a group of different personalities who all behave with silence, to a class of different personalities who all say the same thing. To achieve this effect, you will have arranged an avoidance contingency. Talking out loud is negatively reinforced because it avoids the consequences of not talking out loud. In the short term, this contingency is more powerful than the avoidance contingency that has operated previously. But once you identify contingencies that maintain a problem behaviour you then need to put in place a different set of contingencies, and behaviour analysis has a lot to say about applications to education.

What role does 'intention' play in the explanation of behaviour? This question applies to all aspects of operant behaviour, not just negative reinforcement. However, we mention it here because it is in some ways easier to see how one might be misled into believing that explanations for behaviour are incomplete until reference is made to an intention that precedes the behaviour. Take, for example, a person who puts on a warm coat and avoids the effects of the cold weather when s/he walks outside. Since we all engage in this behaviour it is fairly safe to assume that thoughts like the following are relatively common:
"Hmmm, it's cold outside! I better put on my coat."
So how do we deal with these thoughts in a natural science? That is to say, what role does a natural science assign to thoughts when it comes to developing an explanation for behaviour? One can deal with this by noting simply that a child had to learn the consequences of not putting on a coat and putting on a coat, either by directly experiencing the cold and warmth respectively, or else by instruction from a guardian. Once learned, the cold weather functions as an environmental stimulus (discussed later in the Chapter ‘Stimulus Control’) to produce both the desire to put on the coat and the behaviour of putting on the coat. Since the desire or intention and the overt behaviour are things that a person does, they are all viewed as behaviours that stand in relation to each other in the presence of the appropriate environmental stimulus. In other words, the existence of a relation between covert behaviour (e.g., thinking about, or intending to put on a coat) and overt behaviour (e.g., putting on a coat) is viewed in the same way as other behaviour is viewed, simply as a complex dependent variable. Thinking, then, cannot be said to cause behaviour or to function as an independent variable because it too is a dependent variable, a behaviour. This analysis was covered previously when we looked at the behavioural stream.

Many people in positions of authority use negative reinforcement to control others. They do so mostly by the use of threats; "Do this or else!"
Under these circumstances, people who are unable to counteract such aversive contingencies feel a tremendous amount of resentment toward authority figures.
Now that you know about positive and negative reinforcement, you can use this information to query the psychological principles employed by people in authority. You can ask them first of all to explain the psychological principles behind the decisions taken to control the working environment of people under their control. You can also ask them to justify the use of negative reinforcement when positive reinforcement is more likely to produce a more harmonious and productive work environment.

Conditioned Aversive Stimuli
As with positive reinforcement, stimuli associated with the use of negative reinforcement acquire distinctive properties. These properties may arise from a combination of classical and operant conditioning principles. With positive reinforcement, these stimuli are called conditioned reinforcers. Here they are called ‘conditioned aversive stimuli’.
“By their excessive use of punishment, some teachers transform themselves, their classroom, and the learning materials they use into conditioned aversive stimuli. All too frequently, this situation produces individuals who avoid teachers, school, and books and who therefore fail to advance academically. Clearly this is a most unfortunate consequence of escape and avoidance conditioning.” (Martin & Pear, 1996, pp. 183-184)
These powerful effects of negative reinforcement are compounded by the reactions of parents, guardians, and other professionals not trained in behaviour analysis. Quite often it is not recognised that the child is a victim of the contingencies employed by educational institutions. Instead, the child is berated for their behaviour while the contingencies that brought about their behaviour are left unchanged.

Differential Negative Reinforcement
If a reinforcement procedure for one behaviour is combined with another procedure (e.g., extinction) to reduce the frequency of an undesirable behaviour, the term differential reinforcement is used.
The planned use of differential negative reinforcement can have positive outcomes. Consider, for example, an experiment by Warzak, Kewman, Stefans, and Johnston (1987):
“Warzak provided treatment for Adam, a 10-year-old boy who reported that he was unable to read following hospitalization for a serious respiratory infection. Prior to the hospitalization, Adam had no trouble reading. Adam now reported that the letters were blurry and moved around on the page when he tried to read. However, he had no difficulty playing video games and engaging in other activities requiring fine visual discriminations.
Warzak implemented treatment consisting of therapeutic reading exercises that lasted from 45 minutes to 2 hours each day. The exercises ‘were designed to be exceedingly tedious and boring’ (p. 173). Adam was asked to read words presented on the page as part of the exercises in each treatment session. When he read the words correctly, the remainder of the therapeutic exercise was cancelled for that day. Correct reading was negatively reinforced by escape from the tedious exercises. If he failed to read words correctly, the therapeutic exercises continued. (This corresponded to extinction of incorrect reading)... The results show that correct reading increased to 100% for all print sizes following the use of differential negative reinforcement.” (Miltenberger, 1997, pp. 165-167).

Figure 5.1

Further examples

Punishment + (Positive)

This heading seems like a contradiction in terms. However, remember the earlier lesson. Think of the behavioural effect first and then think about the procedure. Here, Punishment refers simply to a decrease in the probability of behaviour, whatever the procedure (Fig. 5.2). The term ‘positive’ in this context refers to a particular kind of contingency such that behaviour produced something in the environment. For most people the term 'punishment' conjures up images that involve the use of aversive or painful procedures for dealing with particular kinds of behaviour. Examples include smacking children, sending people to prison, or imposing fines. In each of these three situations, those who apply the punishment do so because they want to reduce the likelihood that specific behaviours will occur again in the future.
Although these procedures are undoubtedly very potent, there are many more circumstances in every-day life where reductions in the future likelihood of behaviour occur without 'aversives' being employed. A few examples will illustrate this:
1. You are sitting and watching the television. Suddenly the picture on your television screen disappears due to a fault. If the picture does not return, you will eventually stop sitting and looking at the TV screen.
2. Consider the case of Jocey, a chatterbox, who, after being told off by the teacher, stops speaking during morning assembly in school.
3. Hypochondriac Jim stopped complaining about his ailments when people no longer paid attention to his complaints.
4. Sadie stopped paying attention to her appearance when, after her wedding, nobody complimented her on her looks anymore and her husband said he loved her anyway.
5. Little Jamie stopped waking during the night after his parents decided that he was not to get a midnight bottle anymore.
6. Johnny stopped whining at the check-out counter in the supermarket after his mother decided that he was only to get sweets on Saturday evenings after his bath.
Despite many more possible examples, a careful analysis will reveal that they all have something in common. Firstly, certain behaviours are reduced in frequency or eliminated entirely. Secondly, certain consequences are responsible for this change in behaviour.
We then arrive at a definition of 'punishment' that is substantially different from that used in everyday language.
Behaviour analysis reserves the term 'punishment' for those instances when we observe a decrease in the probability of behaviour as a result of certain consequence(s) of that behaviour. Viewed in this way we say that the definition of punishment is a functional definition because the decrease in behaviour is a function of the consequences; note, reinforcement is also functionally defined. On those occasions when consequences do not reduce the likelihood of behaviour, it makes sense to alter these consequences until the desired outcome is attained.
A major advantage of a functional definition of punishment is that it creates opportunities for developing behaviour change procedures that do not rely on the use of aversives.
Take, for example, the use of non-contingent reinforcer delivery. If the behaviour of responding on a lever has been reinforced with food, deliveries of 'free food' will reduce the likelihood of lever pressing. In other words, when the contingency between lever pressing and food delivery is contaminated with non-contingent food delivery, a punishment effect is observed, i.e., there is a reduction in lever pressing.
Findings like this, coupled with a functional definition of punishment, raise questions about many procedures that are commonly used to punish behaviour. In many cases, it is possible to see that the so-called punishment procedure is more accurately viewed as a form of retribution. For example, it is often the case that school records of children who have been given detention after school (in an attempt to punish a misdemeanour) show that the same children are repeatedly in detention. Using a functional definition of punishment, it is easy to see that ‘detention’ is not functioning as a punisher. In some cases, it may actually function as a reinforcer for the very behaviour that is to be eliminated (i.e., if there is an increase in the behaviour, then the functional definition of reinforcement tells us that we are witnessing a reinforcement effect).
A functional definition of punishment has implications for the design of therapy. For example, muscular tics can be reduced in frequency by exercising incompatible muscles; life threatening ruminative vomiting in babies has been reduced by using brief squirts of lemon juice; vandalism in a car park has been reduced by putting up daily signs indicating the incidence of vandalism; sibling aggression can be reduced by ignoring it while praising co-operative sibling behaviour.

It is often thought that punishment must be aversive to be effective. As we have already seen, in behaviour analysis the term punishment is functionally defined, i.e., the effect a consequence has on the future probability of the behaviour occurring again determines the label given to that outcome. If the effect is a reduction in behaviour, the consequence is termed a 'punisher'.
If a punishment procedure is deemed necessary in applied behaviour analysis, the least aversive punishment procedure is usually the preferred method of intervention. This means that before using an aversive procedure, all other possibilities have to be considered, tested and found non-effective. In most cases non-aversive procedures are in fact found to be as effective, if not more so, than aversive procedures.

Over 90% of the population have been smacked as children and the great majority of parents admit to smacking their own children at one time or another, despite the fact that smacking is illegal in many countries. For example, in the UK the debate about whether parents have a right to smack their children or not can become heated. Those who are for the use of smacking as an educational method usually accuse those who are against it of being liberalists who let their children run wild. This is obviously not the case.
The focus of behaviour analysis is not on whether parents have a right to physically punish their children, but on answering the question “What is the most effective way to deal with problem or undesired behaviours in children?” Throughout this tutorial you have been introduced to the effects that consequences have on behaviour. There is a full range of consequences that can be applied successfully when children misbehave and that, at the same time, do not inflict physical pain. For example, differential reinforcement of alternative behaviours, extinction, response cost procedures, time-out from positive reinforcement, stimulus control procedures, observational learning and reasoning and rules are all viable alternatives to inflicting physical pain or coercion in order to suppress undesired behaviours.

Reprimands, criticism and shouting are the punishment procedures most often used by parents, teachers and others. Reprimands can be effective if initially used with other punishing events such as time-out or response cost. Eventually reprimands may become effective on their own. However, parents and teachers often expect reprimands to work without prior training and are surprised when this does not happen. If effectively trained, reprimands are usually an acceptable, non-aversive behaviour reduction procedure.

Strange as it may seem, there are those who object to the planned use of behavioural principles. They often argue that it is unethical to deliberately set out to control the behaviour of another individual. However, in any social situation it is not possible to interact without arranging consequences for each other’s behaviour. The normal to-and-fro of a conversation involves mutual control. Is it not better to be aware of how we control, and of who controls us, so that we can keep people accountable for how they control? Ethical considerations should undoubtedly be at the forefront of all treatment considerations. In fact, because of the powerful effects obtained from the application of behavioural principles, behaviour analysts deal with ethical questions in great detail. Most books on applied behaviour analysis include a chapter on ethical considerations. And remember, behavioural principles are not invented by mad scientists; they are discovered by scientists. Once they are discovered, it makes sense to take advantage of them to facilitate change when it is needed, whether the change is needed in oneself or by another person.
While the main ethical questions refer to consent of patients, choice of target behaviour and choice of intervention, further questions posed by behaviour analysts address issues such as accountability and effectiveness. Carefully designed and properly monitored behavioural interventions usually exceed other methods in effectiveness and accountability to the patient, the carers, the agency and the scientific community. The ethical question behaviour analysts, therefore, ask is whether or not it is ethical to use methods of intervention that are less effective when more effective methods are available. Behaviour analysts state as one of their ethical goals "the client's right to effective treatment".

Target Behaviour
Target behaviours must be agreed between client (parent or carer) and behaviour analysts prior to intervention.
Before targeting a behaviour for reduction, you should determine who will benefit from the reduction. Will the person whose behaviour is to be reduced benefit or those working or living with him or her, or caring for him or her? Ethically responsible decisions have to be taken before target behaviours are defined and behaviour reduction procedures are implemented.

There is a range of problems with the use of aversive procedures. For example, the deliberate infliction of physical pain, such as a smack or a push, can lead to imitation of such behaviours. It is often observed that children who experience being smacked will use the same method when dealing with problems presented by younger children.
Physical pain also causes vigorous responding, such as intense crying or lashing out. While a child is engaged in these kinds of behaviours he or she will not be able to learn appropriate behaviours. On the other hand, physical pain can cause response suppression, i.e., the person suffering the pain does not respond at all, stiffens up or withdraws into himself or herself. Again, in this situation no new learning can take place. The only learning that can take place when physical pain is inflicted is the avoidance of both the pain and the people who inflict the pain.
This avoidance conditioning is often accompanied by negative emotional conditioning. In other words, the person who experiences the pain begins to avoid and dislike the person who inflicts the pain.
One of the main problems with punishment in general is that by definition it reduces behaviour but no new or more desirable behaviours are established in procedures that rely on punishment alone.

An undesirable side effect of punishment can be that while observable behaviours are reduced, private behaviours such as negative emotions and fearfulness may increase.
These kinds of emotional behaviours are not only unpleasant, they can actually interfere with the effectiveness of the behaviour change procedure. As a result, the person who is punished may not be receptive to learning from those who punish them.

It has been observed that aggressive behaviours can occur when pain is inflicted on a person or animal. This is clearly an undesirable side effect of punishment and indicates that punishment that inflicts physical pain should be avoided at all costs.

Opponents of behavioural methods often claim that reducing one undesired behaviour will only lead to symptom substitution, i.e., another undesired behaviour will develop in its place. This supposition is not supported by research. The only time new undesired behaviours may occur is if the behavioural intervention was not designed, implemented, or monitored correctly, thoroughly and flexibly. When punishment procedures are used as part of an overall programme, this will also include procedures aimed at substituting undesired behaviours with desired ones.

When using punishment procedures it is important to remember the effects of imitation learning.
Modelling or imitation will include those behaviours that are intended to provide consequences for undesired behaviours. For example, a parent who smacks her child should not be surprised to find her child smacking other children.

Stimulus Control
Just as stimuli can acquire discriminative properties when behaviours that occur in their presence are reinforced, they can acquire discriminative properties when behaviours that occur in their presence are punished. Speed signs are a good example. Once you have been heavily fined when speeding, you will be more likely to reduce your speed when you drive past a speed restriction sign. Stimulus control procedures can be useful in achieving non-aversive behaviour reduction.

Habit Reversal
Habit reversal is a comprehensive treatment procedure that has been shown to be highly effective in reducing nervous muscle and vocal tics including eye blinking, mouth twitching, hair pulling and other habit-related disorders. Amongst other things, it includes an activity punishment procedure.
The person is required repeatedly to engage in a response that is competing and incompatible with the undesired response; e.g., in the treatment of head jerks the patient may be required to engage in neck muscle contraction, a response incompatible with head jerking.

Figure 5.2

Further examples

Punishment - (Negative)

Now, before you read on, you should be able to work this one out for yourself. What is it you have to remember about the terms?

When the result of withdrawing a stimulus is a decrease in the probability of behaviour occurring again, we speak of negative punishment. It is important to remember that this functional definition does not refer to any specific stimulus and there are, of course, individual differences in how people respond to different stimuli (Fig. 5.3).

However, for most people negative punishment is experienced as unpleasant and behaviour analysts generally prefer to use positive reinforcement contingencies where at all possible. Yet, there may be situations in everyday life in which negative punishment contingencies prevail. The three most common occurrences of negative punishment are situations where a specific amount of tangible or symbolic reinforcers is withdrawn (response cost), situations where reinforcers are withdrawn for a certain period of time (time-out from positive reinforcement), and situations where reinforcers are withdrawn forever (extinction).

‘Response cost’ is a term used for situations in which a specific amount of tangible or symbolic reinforcers is withdrawn, resulting in a decrease in behaviour. Examples include:

  • pocket money reduction
  • withholding of privilege
  • less or no sweets
  • yardage penalty in football
  • speeding fine
  • lowering of grade in school
  • loss of free time
  • point penalty system
  • removal of tokens
  • no dessert today

‘Time-out’ from positive reinforcement is a term used for situations in which reinforcers are withdrawn for a specific time period, resulting in a decrease in behaviour. Examples include:

  • When children fight over a toy, the parent takes the toy away until the fighting stops.
  • When children argue over candy, the candy is withdrawn until the argument stops.
  • When people fall out over TV programmes, the TV is switched off until the fighting stops.

When the reinforcer is difficult to identify or remove, there are times when it seems easier to remove the person from the reinforcing environment for a short period of time. For example:

  • A child who misbehaves has to sit in a 'naughty' chair for a short period of time.
  • A person who behaves aggressively is put in a non-stimulating room for a short time.
  • A child who vandalises the living room is put in the hallway or a corner for a few minutes.

Removing the person from all potential reinforcers (as in the examples above) is a contested procedure. It may be deemed necessary because the reinforcers for the problem behaviour cannot be identified, or because the reinforcer cannot be removed, as in a classroom, for example, when a child misbehaves and his peers are laughing (and this laughing functions as a reinforcer for the misbehaviour). In any case, it is important to make sure the time-out from positive reinforcement procedure is effective, i.e., if you find the same person in a time-out room/chair a lot of the time, then it is obviously not working.
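
The check described above can be made concrete as a simple before-and-after comparison. The sketch below is illustrative only: the function name, the weekly counts and the use of a plain mean are assumptions made for this example, not a standard behaviour-analytic procedure, and a real evaluation would rely on a properly designed and monitored intervention.

```python
from statistics import mean


def functions_as_punisher(baseline_counts, intervention_counts):
    """Return True if the behaviour occurred less often, on average,
    once the consequence (e.g. time-out) was in place.

    Both arguments are hypothetical lists holding how often the target
    behaviour was observed per week.
    """
    return mean(intervention_counts) < mean(baseline_counts)


# Hypothetical weekly counts of the target misbehaviour.
before_time_out = [4, 5, 4, 6]  # weeks before time-out was introduced
during_time_out = [5, 6, 5, 6]  # weeks while time-out was in use

if functions_as_punisher(before_time_out, during_time_out):
    print("Rate decreased: time-out is functioning as a punisher.")
else:
    print("Rate did not decrease: time-out is not functioning as a punisher.")
```

With these made-up numbers the rate does not fall, so by the functional definition the time-out procedure would not be called punishment, however unpleasant it is intended to be.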

‘Extinction’ is a term used for situations in which reinforcers are withdrawn, resulting in a decrease in behaviour.

Examples of social extinction: withdrawing attention (ignoring) from a misbehaving child; withdrawing sweets from the shop counter to reduce temper tantrums; the loss of a loved one leads to a reduction of behaviours typical for your relationship with him/her; the loss of a specific job leads to a reduction of behaviours typical for you carrying out this job.

Examples of non-social extinction: if the light switch in the bathroom has been moved, the behaviour of looking for it in the old place decreases; when you take your watch off during your holidays, the behaviour of looking at your wrist to check the time decreases.

Examples of sensory extinction: if your pen runs out of ink, you will stop using it; if your eyesight is failing, you will stop relying on vision; if your hearing is failing you will stop relying on auditory stimuli.

Parents are often told to 'ignore' certain unwanted behaviours in their children in order to reduce them. However, putting behaviours on extinction is not the easiest behaviour reduction procedure. Firstly, during the process of extinction parents should be told to expect a so-called 'extinction burst', i.e., it gets worse before it gets better. For example, when putting a problem behaviour such as temper tantrums on extinction, parents usually have to contend with much worse tantrums first before they decrease. If undesired behaviour is accidentally reinforced during this extinction burst, i.e., if parents 'give in' to these major tantrums, the undesired behaviour will increase and the problem will be exacerbated rather than eradicated.

Furthermore, a phenomenon called 'spontaneous recovery' must be expected. This means that even if the undesired behaviour is reduced or eradicated it may spontaneously reappear for a short period of time at a later stage. Parents may be surprised or disappointed and accidentally reinforce the 'recovered' behaviour problem. Needless to say, this can cause untold problems.

Extinction can, however, be very useful when used in conjunction with positive reinforcement procedures that aim to increase desired alternative behaviours.

Conditioned Punisher
Punishment can lead to the person who applies the punishment becoming a conditioned punisher. This is particularly important when it is the parent who does the punishing. The child may learn to avoid the parent and not only the punishment. At the same time, physical punishment can be effective in the short term in suppressing the undesired behaviour. The parent’s use of such an aversive method may thus be reinforced, and they are likely to use it again in future. That is, they fall into the so-called ‘negative reinforcement trap’, in which the parent’s behaviour of smacking, or of giving in to a child’s nagging, is negatively reinforced by the temporary removal of the child’s misbehaviour.

Differential reinforcement of alternative behaviour (DRA or ALT-R) is an effective behaviour reduction procedure in which any specific alternative to the undesired behaviours is reinforced.

A special category of DRA is Differential reinforcement of incompatible behaviour (DRI), where a behaviour that is incompatible with the undesired behaviour is selected for reinforcement. For example, co-operative behaviour is reinforced in a programme designed to reduce aggressive behaviour because it is incompatible with the undesired behaviour.

Differential reinforcement of zero rates of the undesired behaviours (DRO), or omission training, is a method in which purely the absence of the unwanted behaviour is reinforced.

Differential reinforcement methods are constructive, benign and acceptable to most clients. They produce lasting change, although it may be necessary to supplement them with other behavioural procedures such as modelling, physical prompting or extinction.

Instructions & Rules
Although punishment as well as reinforcement can be effective without instruction or rules, verbal instructions are often included in behaviour change procedures. Instructions should specify the undesired response as well as the desired response: “I don't want you to do X. I want you to do Y.” Instructions should also specify why the response is not desired: “I don't want you to play with the ball indoors, because the light fitting may get broken. You can play with the ball outside.”

In the analysis of the role of rules and instructions, it is important to remember how language was acquired. All too often we forget that the meaning of words is handed down from generation to generation often without knowledge of what exactly each word means to the learner. How does a parent teach the word 'love' to a child? The parent will never know exactly what the child feels when they teach him that what he is feeling is called 'love'.

Thus, when instructions or rules are ineffective in changing behaviour, we should not get annoyed with the child but consider that maybe the words were not learnt properly. Rules and instructions are only as good as the learning process that was employed to establish them.

Like reinforcement, punishment is more effective if applied without delay, immediately following the undesired behaviour and every time the offence occurs. However, the value of punishment will lessen with overuse. You should avoid extended periods of punishment and introduce a variety of punishers rather than rely on one particular punisher. Better still, combine punishment procedures with differential reinforcement of desired behaviours, thereby shortening the time span that necessitates punishment procedures.

When one behaviour is reduced, another behaviour increases. When we stop walking, we start standing; when we stop speaking, we start being silent; when we stop sleeping, we start waking. This is the 'Yin' and 'Yang' of behaviour, or its 'covariation'.

In applied behaviour analysis we can use this covariation effect by focussing on both the behaviours to be reduced and those to be increased, thus increasing treatment success.

New Behaviour
Punishment does not establish new behaviours. As such, we must remember always to use punishment procedures in conjunction with procedures designed to establish alternative, new behaviours.
This will not only enhance the effectiveness of therapy but also ensure long-term maintenance of the changes that were achieved. We must remember that it is the behaviour that is undesirable, not the behaver.

The effectiveness of the punisher must be the main consideration in the choice of punisher in therapy. We cannot assume that a certain consequence functions as a punisher for the behaviour in question just because it functioned that way for another person's behaviour or even for another behaviour in the same person. The more effective a punisher is at the outset, the less likely it is that it will have to be intensified later on.

In ‘overcorrection’ procedures the person is required to practise the wanted behaviour over and over again. This procedure can be effective in certain situations and, because of its educative and practice component, it is often preferred by school personnel, parents, and the public. Unlike other punishment procedures, overcorrection does include learning desired and constructive behaviours.

A simple overview of the basic principles described here is shown in Movie 5.3.

Figure 5.3

Further examples

Motivating operation
The reinforcing effectiveness of any consequence varies from time to time. In lay vocabulary we refer to this fact when we say that someone is/isn't motivated.
“For example, water is likely to be an effective reinforcer if someone has been deprived of water for some time, but it is less likely to be so after a large quantity of water has been consumed. The things that can be done to change the effectiveness [i.e., environmental manipulations] are called establishing operations. Deprivation and satiation are two examples, but they are not the only possibilities.” (Catania, 1994, p. 26)

During the shaping exercise shown earlier with the student, we saw how instructions ensured that a simple buzzer functioned as a reinforcer for his behaviour. We could say that this instruction functioned as an establishing operation for the buzzer. A motivating operation is defined as an event that momentarily alters (1) the reinforcing properties of a reinforcer and (2) the frequency of occurrence of the behaviour relevant to those consequences. In principle there are two kinds of motivating operations: an establishing operation increases the reinforcing properties of an environmental event, while an abolishing operation decreases them. In other words, something can happen that makes a reinforcer stronger (establishing); for example, on a hot day a cool drink may be a stronger reinforcer than usual. On the other hand, something can happen that makes a reinforcer weaker (abolishing); for example, on a cold, wintry day a cool drink will not have the same reinforcing effect as in the summer time.

Motivating operations can be unconditioned (such as deprivation or satiation) or conditioned, with value-altering effects that are a result of the organism’s learning history.

In sum, Reinforcement and Punishment are functionally defined. That is, the observed change in probability of behaviour determines whether the label Reinforcement or Punishment is to be used (see summary Movie 5.3).
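
The two-step analysis used throughout this chapter (first the change in the probability of behaviour, then whether something was added or taken away) can be written out as a small lookup. This is only a sketch: the function name and the string labels are assumptions chosen for illustration.

```python
def label_effect(probability_change, stimulus_change):
    """Name the effect from two observations.

    probability_change: "increase", "decrease" or "unchanged"
    stimulus_change:    "added" or "removed"
    (Both sets of labels are illustrative assumptions.)
    """
    # Step 1: the change in the probability of behaviour decides
    # between reinforcement and punishment.
    effects = {"increase": "reinforcement", "decrease": "punishment"}
    if probability_change not in effects:
        return "neither reinforcement nor punishment"

    # Step 2: whether something was added or taken away decides
    # between positive and negative.
    procedures = {"added": "positive", "removed": "negative"}
    return f"{procedures[stimulus_change]} {effects[probability_change]}"


print(label_effect("decrease", "removed"))  # negative punishment
print(label_effect("increase", "added"))    # positive reinforcement
```

Note that nothing in the lookup refers to how pleasant or unpleasant the stimulus is; only the observed change in behaviour and the direction of the stimulus change enter into the label.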

Additional Video Resources