Modular Reinforcement Learning as a Model of Embodied Cognition
The severe speed limitations of the brain's neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be rapidly accessed. One way this could happen would be if the brain used some kind of decomposition whereby behavioral primitives could be quickly accessed and combined. This realization motivates our research program, which studies the capabilities of groups of independent sensori-motor Markov decision process (MDP) modules in directing behavior.
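The abstract does not spell out how independent modules are combined; a common arbitration scheme in the modular-RL literature is to have each module learn its own Q-values and select the action with the greatest summed value across modules. The sketch below illustrates that idea with illustrative placeholder states, actions, and rewards — it is an assumption about the general approach, not the speaker's specific model.

```python
# Hedged sketch of modular Q-learning with "greatest-mass" arbitration:
# each sensori-motor module learns its own Q-table from its own reward
# signal, and behavior is chosen by summing Q-values across modules.
# States, actions, and learning constants here are illustrative only.

ACTIONS = ["left", "right", "forward"]

class Module:
    """One independent sensori-motor MDP module with its own Q-table."""
    def __init__(self, alpha=0.1, gamma=0.9):
        self.q = {}          # maps (state, action) -> estimated value
        self.alpha = alpha   # learning rate
        self.gamma = gamma   # discount factor

    def value(self, state, action):
        return self.q.get((state, action), 0.0)

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update driven by this module's own reward.
        best_next = max(self.value(next_state, a) for a in ACTIONS)
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

def arbitrate(modules, states):
    """Pick the action whose Q-value, summed over all modules, is largest."""
    return max(ACTIONS,
               key=lambda a: sum(m.value(s, a)
                                 for m, s in zip(modules, states)))
```

Because each module's Q-table is small and local, the modules can be learned and queried independently, which is one way the decomposition described above could keep behavioral primitives quickly accessible.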
We use virtual environments (VEs) to explore the sensitivity of the modular models to environmental complexity. VEs allow environmental parameters to be manipulated systematically. Another component we depend on is realistic humanoid avatars that can learn from human motion-capture data. Our test settings are urban walking and driving environments with an array of rewarded sites that allows exploration of tradeoffs between the goals of different modules. By manipulating these sites, together with environmental complexity, we obtain parametric data on the extent to which modular MDPs are useful models.
Speaker: Dana Ballard, PhD, Professor, Department of Computer Science, University of Texas at Austin
Room 489
Friday, 10/29/10
Cost: Free
