Reinforcement Learning
2023
T. Schmied, M. Hofmarcher, F. Paischer, R. Pascanu, and S. Hochreiter (2023) Learning to Modulate Pre-trained Models in RL. arXiv:2306.14884, 2023-06-26.
F. Paischer, T. Adler, M. Hofmarcher, and S. Hochreiter (2023) Semantic HELM: An Interpretable Memory for Reinforcement Learning. arXiv:2306.09312, 2023-06-15.
2022
C. Steinparz, T. Schmied, F. Paischer, M.-C. Dinu, V. Patil, A. Bitto-Nemling, H. Eghbal-zadeh, and S. Hochreiter (2022) Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning. Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR, 199, 441-469, 2022-11-28.
F. Paischer, T. Adler, V. Patil, A. Bitto-Nemling, M. Holzleitner, S. Lehner, H. Eghbal-zadeh, and S. Hochreiter (2022) History Compression via Language Models in Reinforcement Learning. Proceedings of the 39th International Conference on Machine Learning, PMLR, 162, 17156-17185, 2022-06-28.
L. Servadei, J. H. Lee, J. A. A. Medina, M. Werner, S. Hochreiter, W. Ecker, and R. Wille (2022) Deep Reinforcement Learning for Optimization at Early Design Stages. IEEE Design & Test, 2022-01-20.
2020
M. Holzleitner, L. Gruber, J. Arjona-Medina, J. Brandstetter, and S. Hochreiter (2020) Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER. arXiv:2012.01399, 2020-12-02.
L. Servadei, J. Zheng, J. Arjona-Medina, M. Werner, V. Esen, S. Hochreiter, W. Ecker, and R. Wille (2020) Cost Optimization at Early Stages of Design Using Deep Reinforcement Learning. Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD, 37-42, 2020-11-16.
V. P. Patil, M. Hofmarcher, M.-C. Dinu, M. Dorfer, P. M. Blies, J. Brandstetter, J. A. Arjona-Medina, and S. Hochreiter (2020) Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution. arXiv:2009.14108, 2020-09-29.
2019
J. A. Arjona-Medina, M. Gillhofer, M. Widrich, T. Unterthiner, J. Brandstetter, and S. Hochreiter (2019) RUDDER: Return Decomposition for Delayed Rewards. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 13566; e-print also at arXiv:1806.07857v3, 2019-09-10.