Any Event That Follows a Response and Decreases the Likelihood of Its Occurring Again Is

Chapter 8. Learning

8.2 Changing Behaviour through Reinforcement and Punishment: Operant Conditioning

Learning Objectives

  1. Outline the principles of operant conditioning.
  2. Explain how learning can be shaped through the use of reinforcement schedules and secondary reinforcers.

In classical conditioning the organism learns to associate new stimuli with natural biological responses such as salivation or fear. The organism does not learn something new but rather begins to perform an existing behaviour in the presence of a new signal. Operant conditioning, on the other hand, is learning that occurs based on the consequences of behaviour and can involve the learning of new actions. Operant conditioning occurs when a dog rolls over on command because it has been praised for doing so in the past, when a classroom bully threatens his classmates because doing so allows him to get his way, and when a child gets good grades because her parents threaten to punish her if she doesn't. In operant conditioning the organism learns from the consequences of its own actions.

How Reinforcement and Punishment Influence Behaviour: The Research of Thorndike and Skinner

Psychologist Edward L. Thorndike (1874-1949) was the first scientist to systematically study operant conditioning. In his research Thorndike (1898) observed cats who had been placed in a "puzzle box" from which they tried to escape ("Video Clip: Thorndike's Puzzle Box"). At first the cats scratched, bit, and swatted haphazardly, without any idea of how to get out. But eventually, and accidentally, they pressed the lever that opened the door and exited to their prize, a scrap of fish. The next time the cat was confined in the box, it attempted fewer of the ineffective responses before carrying out the successful escape, and after several trials the cat learned to almost immediately make the correct response.

Observing these changes in the cats' behaviour led Thorndike to develop his law of effect, the principle that responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation, whereas responses that produce a typically unpleasant outcome are less likely to occur again in the situation (Thorndike, 1911). The essence of the law of effect is that successful responses, because they are pleasurable, are "stamped in" by experience and thus occur more often. Unsuccessful responses, which produce unpleasant experiences, are "stamped out" and subsequently occur less frequently.

When Thorndike placed his cats in a puzzle box, he found that they learned to engage in the important escape behaviour faster after each trial. Thorndike described the learning that follows reinforcement in terms of the law of effect.

Watch: "Thorndike's Puzzle Box" [YouTube]: http://www.youtube.com/watch?v=BDujDOLre-8

The influential behavioural psychologist B. F. Skinner (1904-1990) expanded on Thorndike's ideas to develop a more complete set of principles to explain operant conditioning. Skinner created specially designed environments known as operant chambers (commonly called Skinner boxes) to systematically study learning. A Skinner box (operant chamber) is a structure that is big enough to fit a rodent or bird and that contains a bar or key that the organism can press or peck to release food or water. It also contains a device to record the animal's responses (Figure 8.5).

The most basic of Skinner's experiments was quite similar to Thorndike's research with cats. A rat placed in the chamber reacted as one might expect, scurrying about the box and sniffing and clawing at the floor and walls. Eventually the rat chanced upon a lever, which it pressed to release pellets of food. The next time around, the rat took a little less time to press the lever, and on successive trials, the time it took to press the lever became shorter and shorter. Soon the rat was pressing the lever as fast as it could eat the food that appeared. As predicted by the law of effect, the rat had learned to repeat the action that brought about the food and stop the actions that did not.

Skinner studied, in detail, how animals changed their behaviour through reinforcement and punishment, and he developed terms that explained the processes of operant learning (Table 8.1, "How Positive and Negative Reinforcement and Punishment Influence Behaviour"). Skinner used the term reinforcer to refer to any event that strengthens or increases the likelihood of a behaviour, and the term punisher to refer to any event that weakens or decreases the likelihood of a behaviour. And he used the terms positive and negative to refer to whether a reinforcement was presented or removed, respectively. Thus, positive reinforcement strengthens a response by presenting something pleasant after the response, and negative reinforcement strengthens a response by reducing or removing something unpleasant. For example, giving a child praise for completing his homework represents positive reinforcement, whereas taking Aspirin to reduce the pain of a headache represents negative reinforcement. In both cases, the reinforcement makes it more likely that the behaviour will occur again in the future.

Figure 8.5 Skinner Box. B. F. Skinner used a Skinner box to study operant learning. The box contains a bar or key that the organism can press to receive food and water, and a device that records the organism's responses.
Table 8.1 How Positive and Negative Reinforcement and Punishment Influence Behaviour.

Operant conditioning term | Description | Outcome | Example
Positive reinforcement | Add or increase a pleasant stimulus | Behaviour is strengthened | Giving a student a prize after he or she gets an A on a test
Negative reinforcement | Reduce or remove an unpleasant stimulus | Behaviour is strengthened | Taking painkillers that eliminate pain increases the likelihood that you will take painkillers again
Positive punishment | Present or add an unpleasant stimulus | Behaviour is weakened | Giving a student extra homework after he or she misbehaves in class
Negative punishment | Reduce or remove a pleasant stimulus | Behaviour is weakened | Taking away a teen's computer after he or she misses curfew

Reinforcement, either positive or negative, works by increasing the likelihood of a behaviour. Punishment, on the other hand, refers to any event that weakens or reduces the likelihood of a behaviour. Positive punishment weakens a response by presenting something unpleasant after the response, whereas negative punishment weakens a response by reducing or removing something pleasant. A child who is grounded after fighting with a sibling (negative punishment) or who loses out on the opportunity to go to recess after getting a poor grade (negative punishment) is less likely to repeat these behaviours.
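Two distinctions fully determine which of the four terms applies: whether a stimulus is added or removed, and whether it is pleasant or unpleasant. As a minimal illustrative sketch (the function name and labels are invented for this example, not part of the original text), the mapping can be written in a few lines of Python:

```python
def operant_term(change, stimulus):
    """Classify a consequence by whether a stimulus is 'added' or 'removed'
    and whether it is 'pleasant' or 'unpleasant'.
    Reinforcement strengthens behaviour; punishment weakens it."""
    if stimulus == "pleasant":
        return "positive reinforcement" if change == "added" else "negative punishment"
    return "positive punishment" if change == "added" else "negative reinforcement"

# Praise for finished homework: a pleasant stimulus is added.
print(operant_term("added", "pleasant"))      # positive reinforcement
# Aspirin removing headache pain: an unpleasant stimulus is removed.
print(operant_term("removed", "unpleasant"))  # negative reinforcement
```

Note that "positive" and "negative" track only whether something is presented or taken away; whether the event is reinforcement or punishment depends on the pleasantness of the stimulus.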

Although the distinction between reinforcement (which increases behaviour) and punishment (which decreases it) is usually clear, in some cases it is difficult to determine whether a reinforcer is positive or negative. On a hot day a cool breeze could be seen as a positive reinforcer (because it brings in cool air) or a negative reinforcer (because it removes hot air). In other cases, reinforcement can be both positive and negative. One may smoke a cigarette both because it brings pleasure (positive reinforcement) and because it eliminates the craving for nicotine (negative reinforcement).

It is also important to note that reinforcement and punishment are not simply opposites. The use of positive reinforcement in changing behaviour is almost always more effective than using punishment. This is because positive reinforcement makes the person or animal feel better, helping create a positive relationship with the person providing the reinforcement. Types of positive reinforcement that are effective in everyday life include verbal praise or approval, the awarding of status or prestige, and direct financial payment. Punishment, on the other hand, is more likely to create only temporary changes in behaviour because it is based on coercion and typically creates a negative and adversarial relationship with the person providing the reinforcement. When the person who provides the punishment leaves the situation, the unwanted behaviour is likely to return.

Creating Complex Behaviours through Operant Conditioning

Perhaps you remember watching a movie or being at a show in which an animal — perhaps a dog, a horse, or a dolphin — did some pretty amazing things. The trainer gave a command and the dolphin swam to the bottom of the pool, picked up a ring on its nose, jumped out of the water through a hoop in the air, dived again to the bottom of the pool, picked up another ring, and then took both of the rings to the trainer at the edge of the pool. The animal was trained to do the trick, and the principles of operant conditioning were used to train it. But these complex behaviours are a far cry from the simple stimulus-response relationships that we have considered thus far. How can reinforcement be used to create complex behaviours such as these?

One way to expand the use of operant learning is to modify the schedule on which the reinforcement is applied. To this point we have only discussed a continuous reinforcement schedule, in which the desired response is reinforced every time it occurs; whenever the dog rolls over, for instance, it gets a biscuit. Continuous reinforcement results in relatively fast learning but also rapid extinction of the desired behaviour once the reinforcer disappears. The problem is that because the organism is used to receiving the reinforcement after every behaviour, the responder may give up quickly when it doesn't appear.

Most real-world reinforcers are not continuous; they occur on a partial (or intermittent) reinforcement schedule, a schedule in which the responses are sometimes reinforced and sometimes not. In comparison to continuous reinforcement, partial reinforcement schedules lead to slower initial learning, but they also lead to greater resistance to extinction. Because the reinforcement does not appear after every behaviour, it takes longer for the learner to determine that the reward is no longer coming, and thus extinction is slower. The four types of partial reinforcement schedules are summarized in Table 8.2, "Reinforcement Schedules."

Table 8.2 Reinforcement Schedules.

Reinforcement schedule | Explanation | Real-world example
Fixed-ratio | Behaviour is reinforced after a specific number of responses. | Factory workers who are paid according to the number of products they produce
Variable-ratio | Behaviour is reinforced after an average, but unpredictable, number of responses. | Payoffs from slot machines and other games of chance
Fixed-interval | Behaviour is reinforced for the first response after a specific amount of time has passed. | People who earn a monthly salary
Variable-interval | Behaviour is reinforced for the first response after an average, but unpredictable, amount of time has passed. | Person who checks email for messages

Partial reinforcement schedules are determined by whether the reinforcement is presented on the basis of the time that elapses between reinforcements (interval) or on the basis of the number of responses that the organism engages in (ratio), and by whether the reinforcement occurs on a regular (fixed) or unpredictable (variable) schedule. In a fixed-interval schedule, reinforcement occurs for the first response made after a specific amount of time has passed. For example, on a one-minute fixed-interval schedule the animal receives a reinforcement every minute, assuming it engages in the behaviour at least once during the minute. As you can see in Figure 8.6, "Examples of Response Patterns by Animals Trained under Different Partial Reinforcement Schedules," animals under fixed-interval schedules tend to slow down their responding immediately after the reinforcement but then increase the behaviour again as the time of the next reinforcement gets closer. (Most students study for exams the same way.) In a variable-interval schedule, the reinforcers appear on an interval schedule, but the timing is varied around the average interval, making the actual appearance of the reinforcer unpredictable. An example might be checking your email: you are reinforced by receiving messages that come, on average, say, every 30 minutes, but the reinforcement occurs only at random times. Interval reinforcement schedules tend to produce slow and steady rates of responding.

Figure 8.6 Examples of Response Patterns by Animals Trained under Different Partial Reinforcement Schedules. Schedules based on the number of responses (ratio types) induce a greater response rate than do schedules based on elapsed time (interval types). Also, unpredictable schedules (variable types) produce stronger responses than do predictable schedules (fixed types).

In a fixed-ratio schedule, a behaviour is reinforced after a specific number of responses. For instance, a rat's behaviour may be reinforced after it has pressed a key 20 times, or a salesperson may receive a bonus after he or she has sold 10 products. As you can see in Figure 8.6, "Examples of Response Patterns by Animals Trained under Different Partial Reinforcement Schedules," once the organism has learned to act in accordance with the fixed-ratio schedule, it will pause only briefly when reinforcement occurs before returning to a high level of responsiveness. A variable-ratio schedule provides reinforcers after a specific but average number of responses. Winning money from slot machines or on a lottery ticket is an example of reinforcement that occurs on a variable-ratio schedule. For instance, a slot machine (see Figure 8.7, "Slot Machine") may be programmed to provide a win every 20 times the user pulls the handle, on average. Ratio schedules tend to produce high rates of responding because reinforcement increases as the number of responses increases.
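The schedule rules above are precise enough to state in code. The following minimal Python sketch (illustrative only; all names are invented for this example) implements three of the schedules as decision functions that report whether each response earns a reinforcer:

```python
import random

def fixed_ratio(n):
    """Reinforce every n-th response, like a rat fed after every 20 bar presses."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

def variable_ratio(n):
    """Reinforce after an unpredictable number of responses averaging n,
    like a slot machine paying off every 20 pulls on average."""
    count, target = 0, random.randint(1, 2 * n - 1)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, random.randint(1, 2 * n - 1)
            return True
        return False
    return respond

def fixed_interval(seconds):
    """Reinforce the first response made after `seconds` have elapsed."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False
    return respond

# On a fixed-ratio-3 schedule, exactly every third response is reinforced.
fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

A variable-interval schedule would follow the same pattern as `fixed_interval`, with the waiting time redrawn around the average after each reinforcer; the unpredictability of the variable schedules is what makes responding so persistent.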

Figure 8.7 Slot Machine. Slot machines are examples of a variable-ratio reinforcement schedule.

Complex behaviours are also created through shaping, the process of guiding an organism's behaviour to the desired outcome through the use of successive approximation to a final desired behaviour. Skinner made extensive use of this procedure in his boxes. For example, he could train a rat to press a bar two times to receive food, by first providing food when the animal moved near the bar. When that behaviour had been learned, Skinner would begin to provide food only when the rat touched the bar. Further shaping limited the reinforcement to only when the rat pressed the bar, to when it pressed the bar and touched it a second time, and finally to only when it pressed the bar twice. Although it can take a long time, in this way operant conditioning can create chains of behaviours that are reinforced only when they are completed.
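The successive-approximation logic of shaping can be sketched as a sequence of criteria that tighten stage by stage, with food delivered only when the behaviour meets the current stage's criterion. This is a toy illustration with invented names, not Skinner's actual procedure:

```python
# Each stage's criterion is stricter than the last, mirroring the rat example:
# near the bar -> touched the bar -> one press -> two presses.
STAGES = [
    lambda b: b["near_bar"],
    lambda b: b["touched_bar"],
    lambda b: b["presses"] >= 1,
    lambda b: b["presses"] >= 2,
]

def deliver_food(behaviour, stage):
    """Reinforce only if the observed behaviour satisfies the current stage."""
    return STAGES[stage](behaviour)

# Early in shaping, merely approaching the bar earns food...
early = {"near_bar": True, "touched_bar": False, "presses": 0}
print(deliver_food(early, stage=0))  # True
# ...but in the final stage only a double press is reinforced.
late = {"near_bar": True, "touched_bar": True, "presses": 1}
print(deliver_food(late, stage=3))   # False
```

The trainer advances to the next stage once the current approximation is reliably performed, which is why shaping can build behaviour chains that would never occur spontaneously.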

Reinforcing animals if they correctly discriminate between similar stimuli allows scientists to test the animals' ability to learn, and the discriminations that they can make are sometimes remarkable. Pigeons have been trained to distinguish between images of Charlie Brown and the other Peanuts characters (Cerella, 1980), and between different styles of music and art (Porter & Neuringer, 1984; Watanabe, Sakamoto & Wakita, 1995).

Behaviours can also be trained through the use of secondary reinforcers. Whereas a primary reinforcer includes stimuli that are naturally preferred or enjoyed by the organism, such as food, water, and relief from pain, a secondary reinforcer (sometimes called a conditioned reinforcer) is a neutral event that has become associated with a primary reinforcer through classical conditioning. An example of a secondary reinforcer would be the whistle given by an animal trainer, which has been associated over time with the primary reinforcer, food. An example of an everyday secondary reinforcer is money. We enjoy having money, not so much for the stimulus itself, but rather for the primary reinforcers (the things that money can buy) with which it is associated.

Key Takeaways

  • Edward Thorndike developed the law of effect: the principle that responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation, whereas responses that produce a typically unpleasant outcome are less likely to occur again in the situation.
  • B. F. Skinner expanded on Thorndike's ideas to develop a set of principles to explain operant conditioning.
  • Positive reinforcement strengthens a response by presenting something that is typically pleasant after the response, whereas negative reinforcement strengthens a response by reducing or removing something that is typically unpleasant.
  • Positive punishment weakens a response by presenting something typically unpleasant after the response, whereas negative punishment weakens a response by reducing or removing something that is typically pleasant.
  • Reinforcement may be either partial or continuous. Partial reinforcement schedules are determined by whether the reinforcement is presented on the basis of the time that elapses between reinforcements (interval) or on the basis of the number of responses that the organism engages in (ratio), and by whether the reinforcement occurs on a regular (fixed) or unpredictable (variable) schedule.
  • Complex behaviours may be created through shaping, the process of guiding an organism's behaviour to the desired outcome through the use of successive approximation to a final desired behaviour.

Exercises and Critical Thinking

  1. Give an example from daily life of each of the following: positive reinforcement, negative reinforcement, positive punishment, negative punishment.
  2. Consider the reinforcement techniques that you might use to train a dog to catch and retrieve a Frisbee that you throw to it.
  3. Watch the following two videos from current television shows. Can you determine which learning procedures are being demonstrated?
    1. The Office: http://www.break.com/usercontent/2009/11/the-office-altoid-experiment-1499823
    2. The Big Bang Theory [YouTube]: http://www.youtube.com/watch?v=JA96Fba-WHk

References

Cerella, J. (1980). The pigeon's analysis of pictures. Pattern Recognition, 12, 1–6.

Kassin, S. (2003). Essentials of psychology. Upper Saddle River, NJ: Prentice Hall. Retrieved from Essentials of Psychology Prentice Hall Companion Website: http://wps.prenhall.com/hss_kassin_essentials_1/15/3933/1006917.cw/index.html

Porter, D., & Neuringer, A. (1984). Music discriminations by pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 10(2), 138–148.

Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. Washington, DC: American Psychological Association.

Thorndike, E. L. (1911). Animal intelligence: Experimental studies. New York, NY: Macmillan. Retrieved from http://www.archive.org/details/animalintelligen00thor

Watanabe, S., Sakamoto, J., & Wakita, M. (1995). Pigeons' discrimination of paintings by Monet and Picasso. Journal of the Experimental Analysis of Behavior, 63(2), 165–174.

Image Attributions

Figure 8.5: "Skinner box" (http://en.wikipedia.org/wiki/File:Skinner_box_photo_02.jpg) is licensed under the CC BY SA 3.0 license (http://creativecommons.org/licenses/by-sa/3.0/deed.en). "Skinner box scheme" by Andreas1 (http://en.wikipedia.org/wiki/File:Skinner_box_scheme_01.png) is licensed under the CC BY SA 3.0 license (http://creativecommons.org/licenses/by-sa/3.0/deed.en)

Figure 8.6: Adapted from Kassin (2003).

Figure 8.7: "Slot Machines in the Hard Rock Casino" by Ted Murphy (http://commons.wikimedia.org/wiki/File:HardRockCasinoSlotMachines.jpg) is licensed under CC BY 2.0 (http://creativecommons.org/licenses/by/2.0/deed.en).


Source: https://opentextbc.ca/introductiontopsychology/chapter/7-2-changing-behavior-through-reinforcement-and-punishment-operant-conditioning/
