PhD Exit Seminar for Annik Yalnizyan-Carlson (Richards Lab)
December 13, 2021 @ 2:10 pm - 3:00 pm
Episodic Control: The Role of Memory in Decision Making
Abstract
Reinforcement learning (RL) is an area of computer science concerned with trial-and-error learning of optimal behaviour. As such, RL can offer a normative framework for understanding learning in animals. However, some aspects of animal behaviour are not well accounted for by traditional RL solution methods. In particular, biological brains are able to learn and relearn quickly, flexibly using memory for specific events (i.e. episodic memory, EM) to solve tasks with minimal experience. Incorporating models of EM in RL can enable rapid learning and facilitate solving long-distance credit assignment problems. These models, however, typically treat EM as a pure record-keeping system for sensorimotor data, whereas psychology and neuroscience understand EM to have broader capabilities. EM maintains conjunctive representations of spatial, temporal, and sensorimotor elements of experience. Moreover, EM is not an ever-expanding repository of experienced information. Rather, many processes exist to promote both active and passive forgetting, and research suggests this is beneficial for learning. Additionally, experiential information is used by multiple memory systems. Animals may initially rely on memories of individual events to guide behaviour, but experience with repeated stimulus-outcome associations also informs habitual memory systems. Thus, with training, behaviour may come to rely more on learned associations than on memory for any specific experience. In this work I present an episodic control system for RL navigation tasks which incorporates some of these features of biological EM. I compare this episodic control system against the learning profile of a traditional RL agent and show that episodic control can use minimal data to arrive at successful behavioural policies in a few-shot manner.
I show that episodic control using representations of events which contain relational information offers greater performance advantages than episodic control using unstructured sensory data, especially under conditions where agents experience moderate forgetting. Finally, I explore how the behaviour of agents using episodic control may be used to provide off-policy training data for network-based model-free RL agents, and how this may ameliorate the data-inefficiency inherent to such systems.
—————————————————————-
Join Zoom Meeting
Monday, December 13th, 2021 @ 2:10pm
https://utoronto.zoom.us/j/87849843656
Meeting ID: 878 4984 3656
Host: Blake A. Richards (blake.richards@utoronto.ca)
—————————————————————-
Details
- Date: December 13, 2021
- Time: 2:10 pm - 3:00 pm