Reinforcement Learning

What is Reinforcement?
When you do something over and over and learn from it.

What is Learning?
When information goes into your brain.

How do we Combine the Two? (Reinforcement Learning)
We put information into your brain over and over again until it sticks.

The Interconnectedness of RL and Intelligence
● The Oxford Dictionary defines intelligence as the ability to acquire and apply knowledge and skills
● In reinforcement learning, a machine acquires knowledge by searching for the best possible behavior, learning from its mistakes along the way
● RL has ties to psychology and to methods of learning

Psychology Ties
● Operant conditioning is a type of associative learning used by many biological organisms, in which the strength of a behavior is modified by reinforcement or punishment. Reinforcement learning uses the same principles of reinforcement as operant conditioning
● Positive reinforcement is when a rewarding event follows a particular behavior, strengthening it and making it more likely to occur again, e.g., giving a dog a treat after it sits
○ Maximizes performance
○ Too much can diminish results
● Negative reinforcement strengthens a behavior by removing an unpleasant stimulus after the behavior occurs, e.g., the seat-belt sound stopping once you put your seat belt on
○ Increases behavior
○ Helps encourage a minimum level of behavior

What is RL in the field of CS?
“how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward” - Wikipedia
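
To make the agent/environment/reward loop in this definition concrete, here is a minimal Python sketch. It is an illustration only, not something from the slides: the two-action environment, its payout probabilities, and the epsilon-greedy rule (usually try the best-looking action, sometimes explore) are all assumptions chosen to keep the example small.

```python
import random

def run_agent(steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    payout_prob = [0.2, 0.8]  # hypothetical environment: chance each action pays reward 1
    estimates = [0.0, 0.0]    # agent's running estimate of each action's average reward
    counts = [0, 0]           # how many times each action has been tried
    total_reward = 0.0        # cumulative reward collected so far

    for _ in range(steps):
        # Agent: explore with probability epsilon, otherwise pick the best-looking action
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: estimates[a])

        # Environment: respond to the action with a reward of 1 or 0
        reward = 1.0 if rng.random() < payout_prob[action] else 0.0

        # Learning: move the estimate toward the observed reward (incremental mean)
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, total_reward

estimates, total_reward = run_agent()
print("estimated value of each action:", estimates)
print("cumulative reward:", total_reward)
```

The agent is never told which action is better. It finds out by trying both and keeping track of the rewards that follow, which is the "learning from mistakes" idea in code form.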

Based on the reward hypothesis:
Any goal can be formalized as the outcome of maximizing a cumulative reward
Example problems and their rewards:
● Flying a helicopter → air time, distance
● Managing an investment portfolio → gains, lack of risk
● Playing a video/board game → winning, maximizing score
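
As a rough sketch of what "cumulative reward" means in practice, the Python below adds up the rewards from one episode. The reward numbers are made up for illustration, and the discount factor gamma is an assumption not mentioned on the slides: it is one common way to make later rewards count a little less, and gamma = 1 gives the plain sum.

```python
def cumulative_reward(rewards, gamma=1.0):
    """Sum of gamma**t * rewards[t] over the time steps t = 0, 1, 2, ..."""
    total = 0.0
    # Work backwards so each step adds its reward plus the discounted future total
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# Hypothetical per-step rewards from one episode of a game
episode_rewards = [1.0, 0.0, 2.0]

print(cumulative_reward(episode_rewards))             # plain sum: 3.0
print(cumulative_reward(episode_rewards, gamma=0.5))  # discounted: 1.5
```

Under the reward hypothesis, choosing a goal amounts to choosing what these per-step rewards are, e.g., seconds of air time for the helicopter or points scored in the game.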

