Accepted Journal Papers
-
Stable-Baselines3: Reliable Reinforcement Learning Implementations
Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, Noah Dormann
-
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang, Remi Tachet des Combes, Romain Laroche
-
Cheap and Deterministic Inference for Deep State-Space Models of Interacting Dynamical Systems
Andreas Look, Barbara Rakitsch, Melih Kandemir, Jan Peters
-
Models of human preference for learning reward functions
W. Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum, Peter Stone, Alessandro G Allievi
-
Reward (Mis)design for autonomous driving
W. Bradley Knox, Alessandro Allievi, Holger Banzhaf, Felix Schmitt, Peter Stone
-
Structure in Deep Reinforcement Learning: A Survey and Open Problems
Aditya Mohan, Amy Zhang, Marius Lindauer
-
Emergent behaviour and neural dynamics in artificial agents tracking odour plumes
Satpreet H. Singh, Floris van Breugel, Rajesh P. N. Rao, Bingni W. Brunton
-
GVFs in the real world: making predictions online for water treatment
Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White
-
Off-Policy Actor-Critic with Emphatic Weightings
Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White
-
Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning
Taylor W. Killian, Sonali Parbhoo, Marzyeh Ghassemi
-
Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning
Aritra Mitra, George J. Pappas, Hamed Hassani
-
Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance
Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater
-
Granger Causal Interaction Skill Chains
Caleb Chuck, Kevin Black, Aditya Arjun, Yuke Zhu, Scott Niekum
-
On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning
Philipp Becker, Gerhard Neumann
-
Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models
Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin J. Talvitie, Michael Bowling, Martha White
-
Investigating the properties of neural network representations in reinforcement learning
Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White