EWRL15 (2022)
The 15th European Workshop on Reinforcement Learning (EWRL 2022)
Dates: 19-21 September 2022
Location: Aula De Carli – Politecnico di Milano – Campus Bovisa, Building B9
Via Durando, 10 – 20158 – Milano (MI) – Italy
There are many entrances to the campus; we suggest using the entrance at Via Durando 10 to reach the venue more easily.
Schedule
Monday – 19/09/2022
8:30 – 9:30 Check-in
9:30 – 10:30 Tutorial 1 (part 1) Matteo Pirotta: “Exploration in Reinforcement Learning”
10:30 – 11:00 coffee break
11:00 – 12:00 Tutorial 1 (part 2)
12:00 – 13:00 Sponsor Talks 1
13:00 – 14:30 Lunch break
14:30 – 15:30 Tutorial 2 (part 1) Matthieu Geist: “Regularization in Reinforcement Learning”
15:30 – 16:00 coffee break
16:00 – 17:00 Tutorial 2 (part 2)
17:00 – 18:00 Sponsor Talks 2
18:00 – 20:00 Welcome reception
Tuesday – 20/09/2022
8:00 – 9:00 Check-in
8:45 – 9:00 Opening remarks
9:00 – 9:40 Invited talk 1 Sarah Perrin: “Scaling up MARL with MFGs and vice versa!”
9:40 – 10:00 Contributed talk 1 (Scalable Deep Reinforcement Learning Algorithms for Mean Field Games)
10:00 – 11:00 Poster session 1 (with Coffee break)
11:00 – 11:40 Invited talk 2 Niao He: “Complexities of Actor-critic Methods for Regularized MDPs and POMDPs”
11:40 – 12:00 Contributed talk 2 (IQ-Learn: Inverse soft-Q Learning for Imitation)
12:00 – 12:20 Contributed talk 3 (Newton-based Policy Search for Networked Multi-agent Reinforcement Learning)
12:20 – 14:00 Lunch break
14:00 – 14:40 Invited talk 3 Ann Nowé: “Beyond the optimal action in Reinforcement Learning”
14:40 – 15:00 Contributed talk 4 (Group Fairness in Reinforcement Learning)
15:00 – 15:20 Contributed talk 5 (Direct Advantage Estimation)
15:20 – 16:00 Invited talk 4 Jan Peters: “Robot RL: Lessons from the Physical World”
16:00 – 18:00 Poster session 2 (with Coffee break)
20:00 Social Dinner
Wednesday – 21/09/2022
8:00 – 9:00 Check-in
9:00 – 9:40 Invited talk 1 Alessandro Lazaric: “Understanding (unsupervised) exploration in goal-based Reinforcement Learning”
9:40 – 10:00 Contributed talk 1 (Optimistic PAC Reinforcement Learning: the Instance-Dependent View)
10:00 – 11:00 Poster session 1 (with Coffee break)
11:00 – 11:40 Invited talk 2 Ciara Pike-Burke: “Multi-armed bandits with history dependent rewards”
11:40 – 12:00 Contributed talk 2 (A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits)
12:00 – 12:20 Contributed talk 3 (Dynamic Pricing with Online Data Aggregation and Learning)
12:20 – 14:00 Lunch break
14:00 – 14:40 Invited talk 3 Gergely Neu: “Primal-Dual Methods for Reinforcement Learning”
14:40 – 15:00 Contributed talk 4 (Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality)
15:00 – 15:20 Contributed talk 5 (Local Feature Swapping for Generalization in Reinforcement Learning)
15:20 – 16:00 Invited talk 4 Richard Sutton: “An Architecture for Intelligence”
16:00 – 18:00 Poster session 2 (with Coffee break)
Poster Session Assignment
Each poster is assigned a day (either September 20 or September 21) and will be presented in both poster sessions (morning and afternoon) of that day.
Poster Session 20 September
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees |
Curriculum Reinforcement Learning via Constrained Optimal Transport |
Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models |
Rate-Optimal Online Convex Optimization in Adaptive Linear Control |
Mixture of Interpretable Experts for Continuous Control |
Adaptive Belief Discretization for POMDP Planning |
IQ-Learn: Inverse soft-Q Learning for Imitation |
On Bayesian Value Function Distributions. |
Minimax-Bayes Reinforcement Learning |
Formulation and validation of a complete car-following model based on deep reinforcement learning |
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States |
A Deep Reinforcement Learning Approach to Supply Chain Inventory Management |
On learning history-based policies for controlling Markov Decision Processes |
Belief states of POMDPs and internal states of recurrent RL agents: an empirical analysis of their mutual information |
Get Back Here: Robust Imitation by Return-to-Distribution Planning |
Semi-Counterfactual Risk Minimization Via Neural Networks |
Dynamic Pricing with Online Data Aggregation and Learning |
Newton-based Policy Search for Networked Multi-agent Reinforcement Learning |
A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management |
Group Fairness in Reinforcement Learning |
Cross-Entropy Soft-Risk Reinforcement Learning |
Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets |
$Q$-Learning for $L_p$ Robust Markov Decision Processes |
Learning Efficiently Function Approximation for Contextual MDP |
Risk-aware linear bandits with convex loss |
Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs |
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration \& Planning |
RLDesigner: Toward Framing Spatial Layout Planning as a Markov Decision Process |
Optimistic Risk-Aware Model-based Reinforcement Learning |
Quantification of Transfer in Reinforcement Learning via Regret Bounds for Learning Agents |
Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies |
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games |
Cooperative Online Learning in Stochastic and Adversarial MDPs |
Interactive Inverse Reinforcement Learning |
A Unifying Framework for Reinforcement Learning and Planning |
Neural Distillation as a State Representation Bottleneck in Reinforcement Learning |
Poster Session 21 September
When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits |
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act |
Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs |
Optimistic PAC Reinforcement Learning: the Instance-Dependent View |
Active Exploration for Inverse Reinforcement Learning |
In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications |
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation |
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs |
Boosting reinforcement learning with sparse and rare rewards using Fleming-Viot particle systems |
Look where you look! Saliency-guided Q-networks for visual RL tasks |
Local Feature Swapping for Generalization in Reinforcement Learning |
On Convergence of Neural asynchronous Q-iteration |
On Reward Binarisation and Bayesian Agents |
Goal-Conditioned Generators of Deep Policies |
Tabular and Deep Learning of Whittle Index |
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularization |
A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits |
Direct Advantage Estimation |
Offline Credit Assignment in Deep Reinforcement Learning with Hindsight Discriminator Networks |
Continuous Control with Action Quantization from Demonstrations |
SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning |
Learning Generative Models with Goal-conditioned Reinforcement Learning |
Analyzing Thompson Sampling for Contextual Bandits via the Lifted Information Ratio |
A Sparse Linear Program for Global Planning in Large MDPs |
Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP |
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games |
Entropy Regularized Reinforcement Learning with Cascading Networks |
Regret Bounds for Satisficing in Multi-Armed Bandit Problems |
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback |
Analysis of Stochastic Processes through Replay Buffers |
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback |
Reinforcement Learning with a Terminator |
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality |
Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses |
Deep Coherent Exploration for Continuous Control |
First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation |
Registration
Registration for the 15th European Workshop on Reinforcement Learning is now open! Registration includes participation in the main event activities, lunch on all days of the event, and the social dinner on September 20th. The early bird registration period ends on August 5th (extended from July 31st). Thanks to the generosity of our sponsors, we are able to offer students a limited number of participation grants in the form of fee waivers. Grants will be awarded based on merit and on diversity and inclusion considerations. If you come from an underrepresented group or have financial needs, please consider applying for a grant. The grant application deadline is July 21st, and grant notifications will be sent by July 28th, so that students who do not receive a grant can still complete the early bird registration payment.
Description
The 15th European Workshop on Reinforcement Learning (EWRL 2022) invites reinforcement learning researchers to participate in the revival of this world-class event. We plan to make this an exciting event for researchers worldwide, not only for the presentation of top-quality papers, but also as a forum for ample discussion of open problems and future research directions.
Reinforcement learning is an active field of research that deals with the problem of sequential decision making in unknown (and often stochastic and/or partially observable) environments. Recently there has been a wealth of both impressive empirical results and significant theoretical advances. Both types of advances are of great importance, and we would like to create a forum to discuss such interesting results.
The workshop will cover a range of sub-topics including (but not limited to):
- MDPs and Dynamic Programming
- Temporal Difference Methods
- Policy Optimization
- Model-based RL and Planning
- Exploration in RL
- Offline RL
- Unsupervised and Intrinsically Motivated RL
- Representation Learning in RL
- Lifelong and Non-stationary RL
- Hierarchical RL
- Partially observable RL
- Multi-agent RL
- Multi-objective RL
- Transfer and Meta RL
- Deep RL
- Imitation Learning and Inverse RL
- Risk-sensitive and robust RL
- Theoretical aspects of RL
- Applications and Real-life RL
Paper Submission
We invite submissions for the 15th European Workshop on Reinforcement Learning (EWRL 2022) from the entire reinforcement learning spectrum. Papers can present new work or summarize recent work by the author(s). There will be no proceedings for EWRL15; as such, papers that are intended for or have been submitted to other conferences or journals are also welcome. Submitted papers will be reviewed by the program committee in a double-blind procedure.
Submissions should follow the JMLR format adapted for EWRL, linked below. There is a limit of 9 pages, excluding acknowledgments, references, and appendix. Authors of accepted papers will be allowed an additional page to prepare the camera-ready version. All accepted papers will be considered for the poster sessions. Outstanding papers will also be considered for a 20-minute oral presentation.
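To make the format requirements concrete, here is a minimal sketch of a submission skeleton, assuming the EWRL 2022 Author Kit follows the standard JMLR jmlr2e style as in the generic JMLR sample; all titles, names, and heading values below are placeholders, and the Author Kit linked below remains the authoritative template.

% Minimal JMLR-style skeleton (a sketch; the official EWRL 2022 Author Kit takes precedence).
\documentclass[twoside,11pt]{article}
\usepackage{jmlr2e}  % standard JMLR style file, which the EWRL format adapts

% Heading information as in the generic JMLR sample; values are placeholders.
\jmlrheading{15}{2022}{1-9}{6/22}{9/22}{Anonymous Authors}
\ShortHeadings{An Example EWRL 2022 Submission}{Anonymous Authors}
\firstpageno{1}

\begin{document}

\title{An Example EWRL 2022 Submission}

% Reviewing is double-blind: keep the submitted version anonymous.
\author{\name Anonymous Author \email anonymous@example.com \\
        \addr Anonymous Institution}

\editor{}  % left blank for the anonymous submission

\maketitle

\begin{abstract}
A one-paragraph summary of the contribution.
\end{abstract}

\begin{keywords}
reinforcement learning, exploration
\end{keywords}

\section{Introduction}
Main text, limited to 9 pages excluding acknowledgments, references, and appendix.
An extra page is allowed for the camera-ready version of accepted papers.

\end{document}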
Please send your inquiries by email to the organizers at ewrl2022@gmail.com.
- Submission deadline: 8 June 2022, 11:59pm AoE (extended from 1 June 2022)
- Page limit: 9 pages excluding acknowledgments, references, and appendix
- Paper format: EWRL 2022 Author Kit
- Paper Submissions: CMT
Important Dates
- Paper submissions due: 8 June 2022, 11:59pm AoE (extended from 1 June 2022)
- Early Registration begins: 1 July 2022
- Participation grant application begins: 1 July 2022
- Paper notification: 14 July 2022
- Participation grant application ends: 21 July 2022
- Participation grant notification: 28 July 2022
- Early registration ends: 5 August 2022 (extended from 31 July 2022)
- Camera ready due: 1 September 2022
- Workshop begins: 19 September 2022
- Workshop ends: 21 September 2022
Confirmed Invited Speakers
- Sarah Perrin (Inria Lille)
- Topic: Scaling up MARL with MFGs and vice versa!
- Niao He (ETH Zurich)
- Topic: Complexities of Actor-critic Methods for Regularized MDPs and POMDPs
- Alessandro Lazaric (Facebook AI Research)
- Topic: Understanding (unsupervised) exploration in goal-based RL
- Gergely Neu (Universitat Pompeu Fabra)
- Topic: Primal-Dual Methods for Reinforcement Learning
- Ann Nowé (Vrije Universiteit Brussel)
- Topic: Beyond the optimal action in RL
- Jan Peters (Technische Universität Darmstadt)
- Topic: Robot RL: Lessons from the Physical World
- Ciara Pike-Burke (Imperial College London)
- Topic: Multi-armed bandits with history dependent rewards
- Richard Sutton (University of Alberta – DeepMind)
- Topic: An Architecture for Intelligence
Confirmed Tutorial Sessions
- Matthieu Geist (Google Research)
- Topic: Regularization in Reinforcement Learning
- Matteo Pirotta (Facebook AI Research)
- Topic: Exploration in Reinforcement Learning
Organizing Committee
General Chair
- Marcello Restelli (Politecnico di Milano – Milan, Italy)
Organizing Chair
- Francesco Trovò (Politecnico di Milano – Milan, Italy)
Program Chair
- Alberto Maria Metelli (Politecnico di Milano – Milan, Italy)
Program Co-Chairs
- Mirco Mutti (Università di Bologna – Bologna & Politecnico di Milano – Milan, Italy)
- Pierre Liotet (Politecnico di Milano – Milan, Italy)
Diversity and Inclusion Chairs
- Giorgia Ramponi (ETH AI Center)
- Riccardo Zamboni (Politecnico di Milano – Milan, Italy)
Workflow Chairs
- Lorenzo Bisi (Politecnico di Milano – Milan, Italy)
- Luca Sabbioni (Politecnico di Milano – Milan, Italy)
Communication Chairs
- Amarildo Likmeta (Università di Bologna – Bologna & Politecnico di Milano – Milan, Italy)
- Marco Mussi (Politecnico di Milano – Milan, Italy)
External Organizers


Sponsorship Program
EWRL 2022 invites companies and research institutions involved in fundamental research on, or applications of, reinforcement learning to become official sponsors of the event. EWRL 2022 offers a single level of sponsorship, at a cost of 5000€, with the following benefits:
- Logo display on the official EWRL 2022 website
- Logo display on the Welcome Kit distributed during the event
- A poster session slot for presenting your research or applications
- Access to the EWRL recruitment database
- Two full-access registrations to the event
Workshop Venue

EWRL 2022 takes place in Milan, Italy. The precise address is:
Aula De Carli – Politecnico di Milano – Campus Bovisa
Via Candiani, 72 – 20158 – Milano (MI) – Italy
Reaching the Venue
Milan is easy to reach by car, train, or airplane. The easiest way to reach Milan is by train, with many daily trains arriving at Milano Centrale, Milano Porta Garibaldi, or Milano Cadorna. By airplane, the most convenient airports for the workshop venue are Milano Malpensa and Milano Linate; you can also arrive via Orio al Serio (Bergamo) Airport. Once you have reached Milan by train or airplane, you can choose one of the following options to reach the workshop venue:
- If you arrive at Milano Malpensa Airport, you can take the Malpensa Express train, which departs directly from the airport every 30 minutes. Its final destination is either Milano Cadorna or Milano Centrale, but in both cases it stops at Milano Bovisa Politecnico, the station where the workshop takes place, so we suggest getting off there rather than riding to the final destination. Milan can also be reached by bus from Malpensa Airport; in this case you will arrive at Milano Centrale train station in around 1 hour. From Milano Centrale, you can take any train on the local lines S1, S2, or S13.
- If you arrive at Milano Linate Airport, you will first need to reach a train station, either by taxi or by bus. The easiest station to reach is Milano Centrale: take bus 73 at the airport and then switch to bus 91. Once you reach a train station, you can take any train on the local lines S1, S2, or S13, as they all stop at Milano Bovisa Politecnico. These buses and trains are accessible with a regular single-use ATM metro ticket.
- If you arrive at Orio al Serio (Bergamo) Airport, there is unfortunately no railway connection to Milan. Nevertheless, you can take a taxi or, better yet, a bus from the airport directly to Milano Centrale. The bus departs right at the airport exit every 20-30 minutes and reaches Milano Centrale in 50-60 minutes. From Milano Centrale, you can take any train on the local lines S1, S2, or S13 to reach the Milano Bovisa Politecnico train station.
- If you reach Milan by train and your route does not pass through Milano Bovisa Politecnico before your final destination, the easiest way to reach the workshop venue is to take any train on the local lines S1, S2, or S13.
Program Committee
Aditya Modi |
Ahmed Touati |
Alain Dutech |
Aldo Pacchiano |
Alessandro Lazaric |
Alessio Russo |
Alexis Jacq |
Amarildo Likmeta |
André Biedenkapp |
Andrea Tirinzoni |
Boris Belousov |
Brendan O’Donoghue |
Carlo D’Eramo |
Christos Dimitrakakis |
Ciara Pike-Burke |
Claire Vernade |
Conor F Hayes |
David Abel |
David Brandfonbrener |
David Meger |
Davide Tateo |
Debabrota Basu |
Divya Grover |
Dongruo Zhou |
Dylan R Ashley |
Elena Smirnova |
Emilie Kaufmann |
Emmanuel Esposito |
Eugenio Bargiacchi |
Felipe Leno da Silva |
Felix Berkenkamp |
Francesco Faccio |
Fredrik Heintz |
Gergely Neu |
Germano Gabbianelli |
Gianluca Drappo |
Giorgia Ramponi |
Giorgio Manganini |
Giuseppe Canonaco |
Glen Berseth |
Hannes Eriksson |
Hao Liu |
Harsh Satija |
Hélène Plisnier |
Ido Greenberg |
Jens Kober |
Johan Källström |
Jonathan J Hunt |
Julien Perolat |
Kamyar Azizzadenesheli |
Khaled Eldowa |
Khazatsky Alexander |
Khimya Khetarpal |
Kianté Brantley |
Léonard Hussenot |
Lior Shani |
Martin Klissarov |
Martino Bernasconi |
Mathieu Reymond |
Matteo Papini |
Matteo Pirotta |
Matthew E. Taylor |
Matthieu Geist |
Nico Montali |
Nicolò A Cesa-Bianchi |
Olivier Bachem |
Omar Darwiche Domingues |
Paolo Bonetti |
Patrick Mannion |
Patrick Saux |
Peter Vamplew |
Philippe Preux |
Pierluca D’Oro |
Pierre Liotet |
Pierre Menard |
Prashanth L.A. |
Puze Liu |
Quanquan Gu |
Rafael Rodriguez Sanchez |
Rahul Savani |
Riad Akrour |
Riccardo Poiani |
Richard S Sutton |
Robert Dadashi |
Roberta Raileanu |
Romina Abachi |
Ronald Ortner |
Roxana Radulescu |
Rui YUAN |
Samuele Tosatto |
Shangdong Yang |
Simon Du |
Tal Lancewicki |
Taylor W Killian |
Tengyang Xie |
Thanh Nguyen-Tang |
Tian Xu |
Tianwei Ni |
Tom Schaul |
Tom Zahavy |
Tommaso R Cesari |
Weitong ZHANG |
Yannis Flet-Berliac |
Yi Su |
Yishay Mansour |
Younggyo Seo |
Accepted Papers
Direct Advantage Estimation Pan, Hsiao-Ru*; Gürtler, Nico; Neitz, Alexander; Schölkopf, Bernhard Accept (Oral) |
Newton-based Policy Search for Networked Multi-agent Reinforcement Learning Manganini, Giorgio*; Fioravanti, Simone; Ramponi, Giorgia Accept (Oral) |
A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits Laforgue, Pierre; Clerici, Giulia*; Cesa-Bianchi, Nicolò; Gilad-Bachrach, Ran Accept (Oral) |
Local Feature Swapping for Generalization in Reinforcement Learning Bertoin, David*; Rachelson, Emmanuel Accept (Oral) |
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality Zahavy, Tom*; Schroecker, Yannick; Behbahani, Feryal; Baumli, Kate; Flennerhag, Sebastian; Hou, Shaobo; Singh, Satinder Accept (Oral) |
Dynamic Pricing with Online Data Aggregation and Learning Genalti, Gianmarco*; Mussi, Marco; Nuara, Alessandro; Gatti, Nicola Accept (Oral) |
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games Lauriere, Mathieu; Perrin, Sarah*; Girgin, Sertan; Muller, Paul; Jain, Ayush; Cabannes, Théophile; Piliouras, Georgios; Perolat, Julien; Élie, Romuald; Pietquin, Olivier; Geist, Matthieu Accept (Oral) |
Optimistic PAC Reinforcement Learning: the Instance-Dependent View Tirinzoni, Andrea*; Al Marjani, Aymen; Kaufmann, Emilie Accept (Oral) |
Group Fairness in Reinforcement Learning Satija, Harsh*; Lazaric, Alessandro; Pirotta, Matteo; Pineau, Joelle Accept (Oral) |
IQ-Learn: Inverse soft-Q Learning for Imitation Garg, Divyansh*; Chakraborty, Shuvam; Cundy, Chris; Song, Jiaming; Geist, Matthieu; Ermon, Stefano Accept (Oral) |
In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications G. León, Borja*; Shanahan, Murray; Belardinelli, Francesco Accept (Poster) |
A Deep Reinforcement Learning Approach to Supply Chain Inventory Management Stranieri, Francesco*; Stella, Fabio Accept (Poster) |
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act Jacq, Alexis*; Ferret, Johan; Geist, Matthieu; Pietquin, Olivier Accept (Poster) |
Continuous Control with Action Quantization from Demonstrations Dadashi, Robert; Hussenot, Léonard*; Vincent, Damien; Girgin, Sertan; Raichuk, Anton; Geist, Matthieu; Pietquin, Olivier Accept (Poster) |
A Unifying Framework for Reinforcement Learning and Planning Moerland, Thomas M*; Broekens, Joost; Plaat, Aske; Jonker, Catholijn M Accept (Poster) |
Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets Strupl, Miroslav*; Faccio, Francesco; Ashley, Dylan R; Schmidhuber, Jürgen; Srivastava, Rupesh Kumar Accept (Poster) |
Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses Shang, Xuedong*; Shao, Han; Qian, Jian Accept (Poster) |
Semi-Counterfactual Risk Minimization Via Neural Networks Aminian, Gholamali*; Vega, Roberto I; Rivasplata, Omar; Toni, Laura; Rodrigues, Miguel Accept (Poster) |
When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits Azize, Achraf*; Basu, Debabrota Accept (Poster) |
Deep Coherent Exploration for Continuous Control Zhang, Yijie*; van Hoof, Herke Accept (Poster) |
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration \& Planning Ouhamma, Reda*; Basu, Debabrota; Maillard, Odalric Accept (Poster) |
Neural Distillation as a State Representation Bottleneck in Reinforcement Learning Guillet, Valentin*; Wilson, Dennis; Aguilar-Melchor, Carlos; Rachelson, Emmanuel Accept (Poster) |
Tabular and Deep Learning of Whittle Index Robledo, Francisco*; Ayesta, Urtzi; Avrachenkov, Konstantin; Borkar, Vivek Accept (Poster) |
Learning Efficiently Function Approximation for Contextual MDP Levy, Orin*; Mansour, Yishay Accept (Poster) |
Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP Levy, Orin*; Mansour, Yishay Accept (Poster) |
Look where you look! Saliency-guided Q-networks for visual RL tasks Bertoin, David*; Zouitine, Adil; Zouitine, Mehdi; Rachelson, Emmanuel Accept (Poster) |
Quantification of Transfer in Reinforcement Learning via Regret Bounds for Learning Agents Tuynman, Adrienne; Ortner, Ronald* Accept (Poster) |
Regret Bounds for Satisficing in Multi-Armed Bandit Problems Michel, Thomas; Hajiabolhassan, Hossein; Ortner, Ronald* Accept (Poster) |
Risk-aware linear bandits with convex loss Saux, Patrick*; Maillard, Odalric Accept (Poster) |
Interactive Inverse Reinforcement Learning Kleine Büning, Thomas*; George, Anne-Marie; Dimitrakakis, Christos Accept (Poster) |
Reinforcement Learning with a Terminator Tennenholtz, Guy*; Merlis, Nadav; Shani, Lior; Mannor, Shie; Shalit, Uri; Chechik, Gal; Hallak, Assaf; Dalal, Gal Accept (Poster) |
On Convergence of Neural asynchronous Q-iteration Smirnova, Elena* Accept (Poster) |
Curriculum Reinforcement Learning via Constrained Optimal Transport Klink, Pascal; Yang, Haoyi; D’Eramo, Carlo*; Peters, Jan; Pajarinen, Joni Accept (Poster) |
Cross-Entropy Soft-Risk Reinforcement Learning Greenberg, Ido*; Chow, Yinlam; Ghavamzadeh, Mohammad; Mannor, Shie Accept (Poster) |
Active Exploration for Inverse Reinforcement Learning Lindner, David*; Krause, Andreas; Ramponi, Giorgia Accept (Poster) |
Offline Credit Assignment in Deep Reinforcement Learning with Hindsight Discriminator Networks Ferret, Johan*; Pietquin, Olivier; Geist, Matthieu Accept (Poster) |
Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs Avalos, Raphael*; Reymond, Mathieu; Nowé, Ann; Roijers, Diederik M Accept (Poster) |
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games Liu, Qinghua*; Szepesvari, Csaba; Jin, Chi Accept (Poster) |
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States Faccio, Francesco*; Ramesh, Aditya; Herrmann, Vincent; Harb, Jean; Schmidhuber, Jürgen Accept (Poster) |
Goal-Conditioned Generators of Deep Policies Faccio, Francesco*; Herrmann, Vincent; Ramesh, Aditya; Kirsch, Louis; Schmidhuber, Jürgen Accept (Poster) |
A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management Meisheri, Hardik*; Nath, Somjit; Baranwal, Mayank; Khadilkar, Harshad Accept (Poster) |
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback Jin, Tiancheng; Lancewicki, Tal*; Luo, Haipeng; Mansour, Yishay; Rosenberg, Aviv Accept (Poster) |
Rate-Optimal Online Convex Optimization in Adaptive Linear Control Cassel, Asaf B*; Cohen, Alon; Koren, Tomer Accept (Poster) |
Cooperative Online Learning in Stochastic and Adversarial MDPs Lancewicki, Tal*; Rosenberg, Aviv; Mansour, Yishay Accept (Poster) |
Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models Hayes, Conor F*; Verstraeten, Timothy; Roijers, Diederik M; Howley, Enda; Mannion, Patrick Accept (Poster) |
Get Back Here: Robust Imitation by Return-to-Distribution Planning Cideron, Geoffrey*; Pietquin, Olivier; Dadashi, Robert; Dulac-Arnold, Gabriel; Tabanpour, Baruch; Geist, Matthieu; Hussenot, Léonard; Curi, Sebastian; Girgin, Sertan Accept (Poster) |
Analysis of Stochastic Processes through Replay Buffers Di-Castro, Shirli*; Mannor, Shie; Di Castro, Dotan Accept (Poster) |
Mixture of Interpretable Experts for Continuous Control Tateo, Davide*; Akrour, Riad; Peters, Jan Accept (Poster) |
On Reward Binarisation and Bayesian Agents Catt, Elliot*; Hutter, Marcus; Veness, Joel Accept (Poster) |
$Q$-Learning for $L_p$ Robust Markov Decision Processes Kumar, Navdeep*; Wang, Kaixin; Levy, Kfir; Mannor, Shie Accept (Poster) |
First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation Yang, Zhao*; Moerland, Thomas M; Preuss, Mike; Plaat, Aske Accept (Poster) |
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation Sancaktar, Cansu*; Blaes, Sebastian; Martius, Georg Accept (Poster) |
Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies YUAN, Rui*; Gower, Robert M; Lazaric, Alessandro; Du, Simon; Xiao, Lin Accept (Poster) |
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs Rouyer, Chloé*; van der Hoeven, Dirk; Cesa-Bianchi, Nicolò; Seldin, Yevgeny Accept (Poster) |
RLDesigner: Toward Framing Spatial Layout Planning as a Markov Decision Process Kakooee, Reza*; Dillenburger, Benjamin Accept (Poster) |
Belief states of POMDPs and internal states of recurrent RL agents: an empirical analysis of their mutual information Lambrechts, Gaspard*; Bolland, Adrien; Ernst, Damien Accept (Poster) |
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits Neu, Gergely; Olkhovskaya, Julia; Papini, Matteo*; Schwartz, Ludovic Accept (Poster) |
Formulation and validation of a complete car-following model based on deep reinforcement learning Hart, Fabian* Accept (Poster) |
Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs Tirinzoni, Andrea*; Al Marjani, Aymen; Kaufmann, Emilie Accept (Poster) |
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees Tirinzoni, Andrea*; Papini, Matteo; Touati, Ahmed; Lazaric, Alessandro; Pirotta, Matteo Accept (Poster) |
Learning Generative Models with Goal-conditioned Reinforcement Learning Vargas Vieyra, Mariana*; Menard, Pierre Accept (Poster) |
Adaptive Belief Discretization for POMDP Planning Grover, Divya*; Dimitrakakis, Christos Accept (Poster) |
Boosting reinforcement learning with sparse and rare rewards using Fleming-Viot particle systems Mastropietro, Daniel G*; Majewski, Szymon; Ayesta, Urtzi; Jonckheere, Matthieu Accept (Poster) |
A Sparse Linear Program for Global Planning in Large MDPs Neu, Gergely; Okolo, Nneka M* Accept (Poster) |
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback Masoudian, Saeed*; Zimmert, Julian; Seldin, Yevgeny Accept (Poster) |
Entropy Regularized Reinforcement Learning with Cascading Networks Shilova, Alena; Della Vecchia, Riccardo; Preux, Philippe; Akrour, Riad* Accept (Poster) |
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularization Patil, Gandharv*; L.A., Prashanth; Precup, Doina Accept (Poster) |
On learning history-based policies for controlling Markov Decision Processes Patil, Gandharv*; Mahajan, Aditya; Precup, Doina Accept (Poster) |
Minimax-Bayes Reinforcement Learning Kleine Büning, Thomas; Dimitrakakis, Christos; Eriksson, Hannes; Grover, Divya; Jorge, Emilio* Accept (Poster) |
On Bayesian Value Function Distributions. Jorge, Emilio; Eriksson, Hannes*; Dimitrakakis, Christos; Basu, Debabrota; Grover, Divya Accept (Poster) |
TempRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning Bagatella, Marco*; Christen, Sammy; Hilliges, Otmar Accept (Poster) |
Optimistic Risk-Aware Model-based Reinforcement Learning Abachi, Romina*; Farahmand, Amir-massoud Accept (Poster) |
Photos from the Workshop
Sponsors
Code of Conduct
The official EWRL 2022 Code of Conduct can be found here