Week Two: Learning Theory and Behavioral Models

Theme

Reinforcement learning (RL) is a very attractive idea: when people first encounter RL, they usually find it compellingly persuasive. But although the last ten years of RL research have produced deep theoretical advances and a range of new algorithms, practical applications remain less widespread than we would all wish. The intense research effort, and considerable progress, in RL theory stems partly from the fact that RL is a clean and simple abstract model of behavioral learning.

The aim of this workshop is to go back to first principles, in the spirit of Cartesian systematic doubt, and to question whether there may be other plausible abstract models of behavioral learning that could serve as starting points for theoretical research (Corey, 2005).

 

The Deceptive Appeal of Reinforcement Learning

RL is a very appealing theory, for several reasons.

First, RL seems to provide a direct formalization of intuitive, folk notions of teaching with, and learning from, rewards and punishments.

Second, RL can be viewed as incremental dynamic programming, and dynamic programming is a general method for solving optimal control problems, with a considerable body of existing theory and excellent optimality guarantees.
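To make the dynamic-programming connection concrete, here is a minimal sketch of value iteration on a toy Markov decision process. The states, actions, transitions, and rewards below are invented purely for illustration and are not taken from the workshop material:

```python
# Hypothetical sketch: value iteration (dynamic programming) on a tiny MDP.
# P[s][a] is a list of (probability, next_state, reward) outcomes.
GAMMA = 0.9  # discount factor (an assumed value)

P = {
    0: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    1: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 2, 1.0)]},
    2: {"left": [(1.0, 2, 0.0)], "right": [(1.0, 2, 0.0)]},
}

def value_iteration(P, gamma=GAMMA, tol=1e-8):
    """Sweep Bellman optimality backups until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Best expected one-step return over all actions
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```

RL algorithms can be seen as performing such backups incrementally, from sampled experience rather than from a known model.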

Third, RL appears biologically plausible: there are 'direct' or model-free methods that require only simple incremental updates, a very short episodic memory, and no look-ahead. For more sophisticated organisms that are capable of forming predictive models of the effects of actions and of planning ahead, these look-aheads and plans could be progressively compiled into policies, and learning would be faster (Corey, 2005).
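As an illustration of such a 'direct' method, the following is a hypothetical sketch of tabular Q-learning on a small chain environment. The environment, the random behavior policy, and all parameter values are assumptions for the sake of the example, not part of the workshop text; note that each update uses only the single most recent transition, with no model and no look-ahead:

```python
import random

ALPHA, GAMMA = 0.5, 0.9  # assumed step size and discount factor
N = 5  # chain of states 0..N-1; reaching state N-1 yields reward 1

def step(s, a):
    """Toy dynamics: a=1 moves right, a=0 moves left; goal at the right end."""
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N - 1 else 0.0
    return s2, r, s2 == N - 1

def train(episodes=300, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N)]  # Q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.randrange(2)  # uniform behavior policy (off-policy learning)
            s2, r, done = step(s, a)
            # One-step incremental update: memory of a single transition only
            target = r + GAMMA * (0.0 if done else max(Q[s2]))
            Q[s][a] += ALPHA * (target - Q[s][a])
            s = s2
    return Q
```

After training, the learned values favor moving right everywhere, even though no step of the algorithm ever consulted a model or planned ahead.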

Lastly, RL seems to provide what a human designer needs: the designer of an agent usually has some definite intentions about what the agent should achieve, and these may be expressed as a real-time performance measure, which may in turn be used as a reward function for RL. It is not necessarily straightforward to transform design requirements ...