Imagine this.

You are on an island all alone, and you find a cave. Every time you enter the cave, there’s a huge meal waiting for you: roasted turkey, grilled pork and beef, a cheeseburger, pizza, chocolate cake, ice cream, and so much more. It is always more than enough for you to survive on the island.

After getting 10 meals in a row, would you enter the cave again without a second thought?

Suddenly, you realize there isn’t always a great meal waiting for you. But after a few more visits to an empty cave, a meal appears again.

Now, would you still enter the cave 10 times in a row without stopping, not knowing how many meals you will receive?

 

What is Operant Conditioning?

Congratulations! We just ran a Skinner Box experiment. The Skinner Box is also known as the Operant Conditioning Chamber.

The Skinner Box, or Operant Conditioning Chamber

This experiment was created and conducted by Burrhus Frederic Skinner, a famous American psychologist. It aims to study the association between a behavior and the consequences, negative or positive, of that behavior. But before getting into the experiments, let’s go over what an “operant behavior” is.

We can summarize the types of behaviors into 2 categories: Respondent behavior and Operant behavior.

Respondent behaviors are those that appear naturally and reflexively, such as feeling tired when you need to rest, or your leg jerking when hit with a hammer. They are the behaviors that don’t require learning. Essentially, they are innate, or inborn behaviors.

On the other hand, operant behaviors can be controlled and taught, and the consequence that follows an action determines how often that action is repeated. So, for example, you can train your dog not to drink toilet water by directing them to their water bottle, or students will remember 1 + 1 = 2 by receiving a checkmark and an A+ on their math test.

We know what operant behavior is, so now let’s move on to the 4 types of operant conditioning!

 

Operant Conditioning Experiments

The four types of operant conditioning

In Operant Conditioning Theory, there are four quadrants: Positive Reinforcement, Positive Punishment, Negative Reinforcement, and Negative Punishment. But in this article, and for this experiment, we will only talk about reinforcement.

The experimental layout is as follows: there’s a family of 3 hungry rats, and there’s a box. Inside the box, there’s a lever that drops food when it is pushed.

Now, let’s place the first brave, hungry rat inside this box and see what will happen.

  • Experiment No. 1 – Positive Reinforcement

Rat No. 1 is placed in the box

We put the hungry rat in the box, and before long, the rat figures out that he receives food from an unknown source every time he pushes the lever. Alrighty! Through operant conditioning, the rat learned that pushing the lever means he will be rewarded with food.

Therefore, this simple experiment showed that rats can learn a behavior through repeated positive reinforcement.

And guess what the rat will do in the future? Now, whenever he is hungry, he will push the lever. Woohoo! Rewards build habits. The rat learned to push the lever and will not be hungry in the box anymore!

  • Experiment No. 2 – Negative Reinforcement

We conduct the second experiment with the original rat’s brother. This time, we add a new function to the box: it shocks the rat with an electric current whenever the lever is not pushed down.

Rat No. 2 (Rat No. 1’s brother) is placed in the box

Again, before long — even quicker than the first time — the rat realized the discomfort from the shock would fade whenever he pushed the lever. So now, the brother rat learned that pushing the lever would mean comfort.

However, do you think the brother rat will keep pushing the lever when we stop delivering the electric shocks?

The experimental result shows: no.

From the second rat’s perspective, there is no reason to keep pushing the lever if nothing is causing the uncomfortable sensation.

When the “negative” part is taken away, the behavior is no longer reinforced and encouraged, so the second rat’s lever-pushing drops off immediately and falls well below the rate of our first rat.

  • Experiment No. 3 – Variable Ratio Reinforcements

We realized there is a problem: if we stop feeding the first rat, he stops pushing the lever after a while, though not immediately, and we wondered why. So for experiment no. 3, we recruited the sister of the rat family.

In this experiment, we will give food to the rat when she pushes the lever, but she will not receive food every single time she pushes it (just like the hypothetical island situation from the beginning).
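
To make the setup concrete, here is a minimal Python sketch of such a schedule. It assumes the simplest possible model, where every lever press pays off with the same fixed probability; the function names, the `mean_ratio` value of 4, and the seed are illustrative choices, not details from the original experiment.

```python
import random

def variable_ratio_schedule(presses, mean_ratio=4, seed=0):
    """Return a list of booleans: True where a lever press is rewarded.

    Assumption: each press pays off with probability 1/mean_ratio, so on
    average one press in `mean_ratio` is rewarded, but the rat cannot
    predict which one.
    """
    rng = random.Random(seed)
    return [rng.random() < 1 / mean_ratio for _ in range(presses)]

def longest_dry_streak(outcomes):
    """Length of the longest run of unrewarded presses."""
    longest = current = 0
    for rewarded in outcomes:
        current = 0 if rewarded else current + 1
        longest = max(longest, current)
    return longest

outcomes = variable_ratio_schedule(presses=40)
# "F" marks a press that dropped food, "." marks an empty press.
print("".join("F" if rewarded else "." for rewarded in outcomes))
print("rewarded presses:", sum(outcomes))
print("longest dry streak:", longest_dry_streak(outcomes))
```

The dry streaks are the point: because empty presses are a normal part of the schedule, a run of them tells the rat nothing about whether the food has stopped for good.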

Rat No. 3 is given food at random times

Guess what will happen now? 

Results reveal that once the food stops coming, it takes the sister rat around 15 hours to stop pushing the lever, which is much longer than either of her two brothers.

Woah.

 

So now we realize that presenting rewards at random will elicit the trained response for a longer period of time, even after the reward is no longer presented, since the subject will keep thinking “next time, I’ll receive the reward.”

 

Application in Real Life

Now, it’s time for us to write the lab report – “Why is this helpful to know?”

Let’s focus on the third experiment for a moment. In this small chamber, not only did we model the learning process, but we also modeled the fundamentals of gambling.

For example, with a slot machine, you have a chance to win enormous prizes after pulling the handle and lining up three of the same symbols.

This is a matter of mathematical probability. However, when you are in front of the slot machine, it is tough to believe objectively that this mechanism is behind your “luck”, because you want the money on your next pull, and you think you will be “lucky.” Therefore, individual failures (aka empty pulls) do not act as a punishment that ends your behavior of pulling the handle. Consequently, you keep behaving the same way, and you unconsciously continue this habitual action as a result of the “learning process”. In this case, it can also be called addiction.
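
To put a rough number on that probability, here is a small Python sketch. It assumes a toy machine with three independent reels, each equally likely to land on any of its symbols; real machines use weighted reels and different payout rules, and the figure of 10 symbols per reel is purely hypothetical.

```python
from fractions import Fraction

def three_of_a_kind_probability(symbols_per_reel):
    """Chance that three independent, fair reels all show the same symbol."""
    # The first reel can land on anything; the other two must match it.
    return Fraction(1, symbols_per_reel) ** 2

def chance_of_no_win(symbols_per_reel, pulls):
    """Chance of seeing no three-of-a-kind in a run of independent pulls."""
    return (1 - three_of_a_kind_probability(symbols_per_reel)) ** pulls

print(three_of_a_kind_probability(10))   # 1/100
print(float(chance_of_no_win(10, 50)))   # chance that 50 pulls in a row all come up empty
```

Under these assumptions, a win shows up on about 1 pull in 100, and there is roughly a 60% chance of going 50 pulls without a single one. Long runs of empty pulls are exactly what the math predicts, even though each pull feels like it could be the lucky one.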

 

Conclusion

The first 2 experiments showed us efficient ways to get others or ourselves to learn a behavior. Positive reinforcement may not elicit an immediate response, but it takes longer for the behavior to become completely extinct. On the other hand, negative reinforcement will likely cause an immediate effect, but it will not take long for the action to disappear.

When we conducted the adapted version, experiment no. 3, we discovered an even more effective way of running the learning process (with the fewest rat treats). From this experiment, we recognized that unpredictability boosted the rat’s incentive to keep pushing the lever: uncertainty is the most irresistible treat in this situation.

Nevertheless, whether negative or positive, reinforcing or punishing, these techniques are all double-edged swords. They can be used to speed up your learning, yet they can also be used to take your valuables right from under your nose.
