Rahul Siripurapu, Vihang Prakash Patil, Kajetan Schweighofer, Marius-Constantin Dinu, Thomas Schmied, Luis Eduardo Ferro Diez, Markus Holzleitner, Hamid Eghbal-Zadeh, Michael K Kopp, and Sepp Hochreiter

InfODist

Stages of curriculum learning with InfODist (a) and the effect of using informative rewards (b).

Curriculum learning (CL) is an essential part of human learning, just as reinforcement learning (RL) is. However, CL agents that are trained using RL with neural networks show limited generalization to later tasks in the curriculum. We show that online distillation using learned informative rewards tackles this problem. Here, we consider a reward to be informative if it is positive when the agent makes progress towards the goal and negative otherwise. An informative reward thus allows an agent to learn immediately to avoid states that are irrelevant to the task, so the value and policy networks do not spend their limited capacity on fitting targets for these irrelevant states. Consequently, generalization to later tasks improves. Our contributions are threefold. First, we propose InfODist, an online distillation method that makes use of informative rewards to significantly improve generalization in CL. Second, we show that training with informative rewards ameliorates the capacity loss phenomenon that was previously attributed to non-stationarities during the training process. Third, we show that learning from task-irrelevant states explains the capacity loss and the subsequent impaired generalization. In conclusion, our work is a crucial step toward scaling curriculum learning to complex real-world tasks.
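The definition of an informative reward given above (positive on progress towards the goal, negative otherwise) can be illustrated with a minimal sketch. This is not the method from the paper, which uses learned informative rewards. The grid world, the Manhattan-distance measure of progress, and the names `manhattan` and `informative_reward` are all assumptions made here purely for illustration.

```python
# Illustrative only: a hand-written stand-in for an informative reward.
# The paper describes *learned* informative rewards; the grid world,
# the distance-based notion of progress, and all names are assumptions.

def manhattan(a, b):
    """Manhattan distance between two (row, col) grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def informative_reward(state, next_state, goal):
    """Return +1.0 if the step moved closer to the goal, -1.0 otherwise."""
    progress = manhattan(state, goal) - manhattan(next_state, goal)
    return 1.0 if progress > 0 else -1.0


goal = (3, 3)
toward = informative_reward((0, 0), (0, 1), goal)  # step closer to the goal
away = informative_reward((1, 1), (0, 1), goal)    # step away from the goal
stay = informative_reward((1, 1), (1, 1), goal)    # no progress made
```

Under this assumed definition, every transition that does not bring the agent closer to the goal is penalized immediately, which is the property the abstract relies on when it argues that the agent can learn at once to avoid task-irrelevant states.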

Deep Reinforcement Learning Workshop at NeurIPS 2022, 2022-12-09.

IARAI Authors
Rahul Siripurapu, Luis Ferro, Dr Michael Kopp, Dr Sepp Hochreiter
Research
Reinforcement Learning
Keywords
Curriculum Learning, Deep Learning, Knowledge Distillation, Markov Decision Process

