T. Schmied, M. Hofmarcher, F. Paischer, R. Pascanu, and S. Hochreiter (2023) Learning to Modulate Pre-trained Models in RL. arXiv:2306.14884, 2023-06-26.

F. Paischer, T. Adler, M. Hofmarcher, and S. Hochreiter (2023) Semantic HELM: An Interpretable Memory for Reinforcement Learning. arXiv:2306.09312, 2023-06-15.

C. Steinparz, T. Schmied, F. Paischer, M.-C. Dinu, V. Patil, A. Bitto-Nemling, H. Eghbal-zadeh, and S. Hochreiter (2022) Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning. Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR, 199, 441-469, 2022-11-28.

F. Paischer, T. Adler, V. Patil, A. Bitto-Nemling, M. Holzleitner, S. Lehner, H. Eghbal-zadeh, and S. Hochreiter (2022) History Compression via Language Models in Reinforcement Learning. Proceedings of the 39th International Conference on Machine Learning, PMLR, 162, 17156-17185, 2022-06-28.

L. Servadei, J. H. Lee, J. A. A. Medina, M. Werner, S. Hochreiter, W. Ecker, and R. Wille (2022) Deep Reinforcement Learning for Optimization at Early Design Stages. IEEE Design & Test, 2022-01-20.

M. Holzleitner, L. Gruber, J. Arjona-Medina, J. Brandstetter, and S. Hochreiter (2020) Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER. arXiv:2012.01399, 2020-12-02.

L. Servadei, J. Zheng, J. Arjona-Medina, M. Werner, V. Esen, S. Hochreiter, W. Ecker, and R. Wille (2020) Cost Optimization at Early Stages of Design Using Deep Reinforcement Learning. Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD, 37-42, 2020-11-16.

V. P. Patil, M. Hofmarcher, M.-C. Dinu, M. Dorfer, P. M. Blies, J. Brandstetter, J. A. Arjona-Medina, and S. Hochreiter (2020) Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution. arXiv:2009.14108, 2020-09-29.

J. A. Arjona-Medina, M. Gillhofer, M. Widrich, T. Unterthiner, J. Brandstetter, and S. Hochreiter (2019) RUDDER: Return Decomposition with Delayed Rewards. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 13566; e-print also at arXiv:1806.07857v3, 2019-09-10.