EWRL2022

The 15th European Workshop on Reinforcement Learning (EWRL 2022)

Dates:    19-21 September 2022
Location:  Aula De Carli – Politecnico di Milano – Campus Bovisa, Building B9

Via Durando, 10 – 20158 – Milano (MI) – Italy

There are many entrances to the campus; we suggest using the entrance at Via Durando 10 for the easiest access to the venue.

Schedule

Monday- 19/09/2022

8:30 – 9:30 Check-in

9:30 – 10:30 Tutorial 1 (part 1) Matteo Pirotta: “Exploration in Reinforcement Learning”

10:30 – 11:00 Coffee break

11:00 – 12:00 Tutorial 1 (part 2)

12:00 – 13:00 Sponsor Talks 1

13:00 – 14:30 Lunch break

14:30 – 15:30 Tutorial 2 (part 1) Matthieu Geist: “Regularization in Reinforcement Learning”

15:30 – 16:00 Coffee break

16:00 – 17:00 Tutorial 2 (part 2)

17:00 – 18:00 Sponsor Talks 2

18:00 – 20:00 Welcome reception

Tuesday- 20/09/2022

8:00 – 9:00 Check-in

8:45 – 9:00 Opening remarks

9:00 – 9:40 Invited talk 1 Sarah Perrin: “Scaling up MARL with MFGs and vice versa!”

9:40 – 10:00 Contributed talk 1 (Scalable Deep Reinforcement Learning Algorithms for Mean Field Games)

10:00 – 11:00 Poster session 1 (with Coffee break)

11:00 – 11:40 Invited talk 2 Niao He: “Complexities of Actor-critic Methods for Regularized MDPs and POMDPs”

11:40 – 12:00 Contributed talk 2 (IQ-Learn: Inverse soft-Q Learning for Imitation)

12:00 – 12:20 Contributed talk 3 (Newton-based Policy Search for Networked Multi-agent Reinforcement Learning)

12:20 – 14:00 Lunch break

14:00 – 14:40 Invited talk 3 Ann Nowé: “Beyond the optimal action in Reinforcement Learning”

14:40 – 15:00 Contributed talk 4 (Group Fairness in Reinforcement Learning)

15:00 – 15:20 Contributed talk 5 (Direct Advantage Estimation)

15:20 – 16:00 Invited talk 4 Jan Peters: “Robot RL: Lessons from the Physical World”

16:00 – 18:00 Poster session 2 (with Coffee break)

20:00 Social Dinner

Wednesday- 21/09/2022

8:00 – 9:00 Check-in

9:00 – 9:40 Invited talk 1 Alessandro Lazaric: “Understanding (unsupervised) exploration in goal-based Reinforcement Learning”

9:40 – 10:00 Contributed talk 1 (Optimistic PAC Reinforcement Learning: the Instance-Dependent View)

10:00 – 11:00 Poster session 1 (with Coffee break)

11:00 – 11:40 Invited talk 2 Ciara Pike-Burke: “Multi-armed bandits with history dependent rewards”

11:40 – 12:00 Contributed talk 2 (A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits)

12:00 – 12:20 Contributed talk 3 (Dynamic Pricing with Online Data Aggregation and Learning)

12:20 – 14:00 Lunch break

14:00 – 14:40 Invited talk 3 Gergely Neu: “Primal-Dual Methods for Reinforcement Learning”

14:40 – 15:00 Contributed talk 4 (Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality)

15:00 – 15:20 Contributed talk 5 (Local Feature Swapping for Generalization in Reinforcement Learning)

15:20 – 16:00 Invited talk 4 Richard Sutton: “An Architecture for Intelligence”

16:00 – 18:00 Poster session 2 (with Coffee break)

Poster Session Assignment

Each poster is assigned a day (either September 20 or September 21) and will be presented in both the morning and afternoon poster sessions of that day.

Poster Session 20 September

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees
Curriculum Reinforcement Learning via Constrained Optimal Transport
Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models
Rate-Optimal Online Convex Optimization in Adaptive Linear Control
Mixture of Interpretable Experts for Continuous Control
Adaptive Belief Discretization for POMDP Planning
IQ-Learn: Inverse soft-Q Learning for Imitation
On Bayesian Value Function Distributions.
Minimax-Bayes Reinforcement Learning
Formulation and validation of a complete car-following model based on deep reinforcement learning
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
A Deep Reinforcement Learning Approach to Supply Chain Inventory Management
On learning history-based policies for controlling Markov Decision Processes
Belief states of POMDPs and internal states of recurrent RL agents: an empirical analysis of their mutual information
Get Back Here: Robust Imitation by Return-to-Distribution Planning
Semi-Counterfactual Risk Minimization Via Neural Networks
Dynamic Pricing with Online Data Aggregation and Learning
Newton-based Policy Search for Networked Multi-agent Reinforcement Learning
A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management
Group Fairness in Reinforcement Learning
Cross-Entropy Soft-Risk Reinforcement Learning
Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets
$Q$-Learning for $L_p$ Robust Markov Decision Processes
Learning Efficiently Function Approximation for Contextual MDP
Risk-aware linear bandits with convex loss
Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration \& Planning
RLDesigner: Toward Framing Spatial Layout Planning as a Markov Decision Process
Optimistic Risk-Aware Model-based Reinforcement Learning
Quantification of Transfer in Reinforcement Learning via Regret Bounds for Learning Agents
Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
Cooperative Online Learning in Stochastic and Adversarial MDPs
Interactive Inverse Reinforcement Learning
A Unifying Framework for Reinforcement Learning and Planning
Neural Distillation as a State Representation Bottleneck in Reinforcement Learning

Poster Session 21 September

When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs
Optimistic PAC Reinforcement Learning: the Instance-Dependent View
Active Exploration for Inverse Reinforcement Learning
In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs
Boosting reinforcement learning with sparse and rare rewards using Fleming-Viot particle systems
Look where you look! Saliency-guided Q-networks for visual RL tasks
Local Feature Swapping for Generalization in Reinforcement Learning
On Convergence of Neural asynchronous Q-iteration
On Reward Binarisation and Bayesian Agents
Goal-Conditioned Generators of Deep Policies
Tabular and Deep Learning of Whittle Index
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularization
A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits
Direct Advantage Estimation
Offline Credit Assignment in Deep Reinforcement Learning with Hindsight Discriminator Networks
Continuous Control with Action Quantization from Demonstrations
SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning
Learning Generative Models with Goal-conditioned Reinforcement Learning
Analyzing Thompson Sampling for Contextual Bandits via the Lifted Information Ratio
A Sparse Linear Program for Global Planning in Large MDPs
Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
Entropy Regularized Reinforcement Learning with Cascading Networks
Regret Bounds for Satisficing in Multi-Armed Bandit Problems
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Analysis of Stochastic Processes through Replay Buffers
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Reinforcement Learning with a Terminator
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality
Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses
Deep Coherent Exploration for Continuous Control
First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation

Registration

Registrations for the 15th European Workshop on Reinforcement Learning are now open! Registration includes participation in the main event activities, lunch on all days of the event, and the social dinner on September 20th. The early bird registration period has been extended and now ends on August 5th (originally July 31st). Thanks to the generosity of our sponsors, we are able to offer students a limited number of participation grants in the form of fee waivers. Grants will be awarded based on merit and diversity & inclusion considerations. If you come from an underrepresented group or have financial need, please consider applying for a grant. The grant application deadline is July 21st, and notifications will be sent by July 28th, so that students who do not receive a grant can still complete the payment for the early bird registration.

Register here!

Description

The 15th European Workshop on Reinforcement Learning (EWRL 2022) invites reinforcement learning researchers to participate in the revival of this world-class event. We plan to make this an exciting event for researchers worldwide, not only for the presentation of top-quality papers, but also as a forum for ample discussion of open problems and future research directions.

Reinforcement learning is an active field of research which deals with the problem of sequential decision making in unknown, often stochastic and/or partially observable environments. Recently there has been a wealth of impressive empirical results as well as significant theoretical advances. Both types of advances are of great importance, and we would like to create a forum to discuss such interesting results.

The workshop will cover a range of sub-topics including (but not limited to):

  • MDPs and Dynamic Programming
  • Temporal Difference Methods
  • Policy Optimization
  • Model-based RL and Planning
  • Exploration in RL
  • Offline RL
  • Unsupervised and Intrinsically Motivated RL
  • Representation Learning in RL
  • Lifelong and Non-stationary RL
  • Hierarchical RL
  • Partially observable RL
  • Multi-agent RL
  • Multi-objective RL
  • Transfer and Meta RL
  • Deep RL
  • Imitation Learning and Inverse RL
  • Risk-sensitive and robust RL
  • Theoretical aspects of RL
  • Applications and Real-life RL

Paper Submission

We invite submissions for the 15th European Workshop on Reinforcement Learning (EWRL 2022) from the entire reinforcement learning spectrum. Papers can present new work or give a summary of recent work by the author(s). There will be no proceedings for EWRL 2022; as such, papers that are intended for or have been submitted to other conferences or journals are also welcome. Submitted papers will be reviewed by the program committee in a double-blind procedure.

Submissions should follow the JMLR format adapted for EWRL, linked below. There is a limit of 9 pages, excluding acknowledgments, references, and appendix. Authors of accepted papers will be allowed an additional page to prepare the camera-ready version. All accepted papers will be considered for the poster sessions. Outstanding papers will also be considered for a 20-minute oral presentation.

Please send your inquiries by email to the organizers at ewrl2022@gmail.com.

  • Submission deadline: 8 June 2022, 11:59pm AoE (extended from 1 June 2022)
  • Page limit: 9 pages excluding acknowledgments, references, and appendix
  • Paper format: EWRL 2022 Author Kit
  • Paper Submissions: CMT

Important Dates

  • Paper submissions due: 8 June 2022, 11:59pm AoE (extended from 1 June 2022)
  • Early Registration begins: 1 July 2022
  • Participation grant application begins: 1 July 2022
  • Paper notification: 14 July 2022
  • Participation grant application ends: 21 July 2022
  • Participation grant notification: 28 July 2022
  • Early registration ends: 5 August 2022 (extended from 31 July 2022)
  • Camera ready due: 1 September 2022
  • Workshop begins: 19 September 2022
  • Workshop ends: 21 September 2022

Confirmed Invited Speakers

  • Sarah Perrin (Inria Lille)
    • Topic: Scaling up MARL with MFGs and vice versa!
  • Niao He (ETH Zurich)
    • Topic: Complexities of Actor-critic Methods for Regularized MDPs and POMDPs
  • Alessandro Lazaric (Facebook AI Research)
    • Topic: Understanding (unsupervised) exploration in goal-based RL
  • Gergely Neu (Universitat Pompeu Fabra)
    • Topic: Primal-Dual Methods for Reinforcement Learning
  • Ann Nowé (Vrije Universiteit Brussel)
    • Topic: Beyond the optimal action in RL
  • Jan Peters (Technische Universität Darmstadt)
    • Topic: Robot RL: Lessons from the Physical World
  • Ciara Pike-Burke (Imperial College London)
    • Topic: Multi-armed bandits with history dependent rewards
  • Richard Sutton (University of Alberta – DeepMind)
    • Topic: An Architecture for Intelligence

Confirmed Tutorial Sessions

  • Matthieu Geist (Google Research)
    • Topic: Regularization in Reinforcement Learning
  • Matteo Pirotta (Facebook AI Research)
    • Topic: Exploration in Reinforcement Learning

Organizing Committee

General Chair

Organizing Chair

Program Chair

Program Co-Chairs

  • Mirco Mutti (Università di Bologna, Bologna & Politecnico di Milano, Milan, Italy)

Diversity and Inclusion Chairs

Workflow Chairs

Communication Chairs

  • Amarildo Likmeta (Università di Bologna, Bologna & Politecnico di Milano, Milan, Italy)

External Organizers

Sponsorship Program

EWRL 2022 invites companies and research institutions involved in fundamental research on or applications of reinforcement learning to become official sponsors of the event. EWRL 2022 offers a single level of sponsorship, at a cost of €5,000, with the following benefits:

  • Logo display on the official EWRL 2022 website
  • Logo display on the Welcome Kit distributed during the event
  • A poster session slot for presenting your research or applications
  • Access to the EWRL recruitment database
  • Two full-access registrations to the event

Workshop Venue

Clockwise from top: Porta Nuova, Sforza Castle, La Scala, Galleria Vittorio Emanuele II, Milano Centrale railway station, Arch of Peace and Milan Cathedral.
Credits: Wikimedia

EWRL2022 takes place in Milan, Italy. The precise address is:

Aula De Carli – Politecnico di Milano – Campus Bovisa

Via Candiani, 72 – 20158 – Milano (MI) – Italy

Reaching the Venue

Milan is easy to travel to by car, train, or airplane. The easiest way to reach Milan is by train, with many daily trains arriving at the stations of Milano Centrale, Milano Porta Garibaldi, or Milano Cadorna. By airplane, the most convenient airports for reaching the workshop venue are Milano Malpensa and Milano Linate; you can also fly into Orio al Serio (Bergamo) Airport. Once you have reached Milan by train or airplane, you can choose among the following options to reach the workshop venue:

  • If you arrive at Milano Malpensa Airport, you can take the Malpensa Express train, which departs directly from the airport every 30 minutes. The train terminates at either Milano Cadorna or Milano Centrale, but in both cases it stops at Milano Bovisa Politecnico, the station serving the workshop venue. We therefore suggest getting off at Milano Bovisa Politecnico rather than riding to the final destination. Milan can also be reached by bus from Malpensa Airport; in this case you will arrive at Milano Centrale train station (in around 1 hour). From Milano Centrale, you can take any train on the local lines S1, S2, or S13.
  • If you arrive at Milano Linate Airport, you will first need to reach a train station, either by taxi or by bus number 73. The easiest station to reach is Milano Centrale, by taking bus 73 at the airport and then switching to bus 91. Once at a train station, you can take any train on the local lines S1, S2, or S13, all of which stop at the Milano Bovisa Politecnico train station. These buses and trains are accessible with a regular single-use ATM metro ticket.
  • If you arrive at Orio al Serio (Bergamo) Airport, there is unfortunately no railway connection to Milan. You can take a taxi or, better yet, a bus from the airport directly to Milano Centrale. The bus departs from the airport exit every 20-30 minutes and reaches Milano Centrale Station in 50-60 minutes. From Milano Centrale, take any train on the local lines S1, S2, or S13 to reach the Milano Bovisa Politecnico train station.
  • If you arrive in Milan by train and your route does not pass through the Milano Bovisa Politecnico train station before your final destination, the easiest way to reach the workshop venue is to take any train on the local lines S1, S2, or S13.

Program Committee

Aditya Modi
Ahmed Touati
Alain Dutech
Aldo Pacchiano
Alessandro Lazaric
Alessio Russo
Alexis Jacq
Amarildo Likmeta
André Biedenkapp
Andrea Tirinzoni
Boris Belousov
Brendan O’Donoghue
Carlo D’Eramo
Christos Dimitrakakis
Ciara Pike-Burke
Claire Vernade
Conor F Hayes
David Abel
David Brandfonbrener
David Meger
Davide Tateo
Debabrota Basu
Divya Grover
Dongruo Zhou
Dylan R Ashley
Elena Smirnova
Emilie Kaufmann
Emmanuel Esposito
Eugenio Bargiacchi
Felipe Leno da Silva
Felix Berkenkamp
Francesco Faccio
Fredrik Heintz
Gergely Neu
Germano Gabbianelli
Gianluca Drappo
Giorgia Ramponi
Giorgio Manganini
Giuseppe Canonaco
Glen Berseth
Hannes Eriksson
Hao Liu
Harsh Satija
Hélène Plisnier
Ido Greenberg
Jens Kober
Johan Källström
Jonathan J Hunt
Julien Perolat
Kamyar Azizzadenesheli
Khaled Eldowa
Khazatsky Alexander
Khimya Khetarpal
Kianté Brantley
Léonard Hussenot
Lior Shani
Martin Klissarov
Martino Bernasconi
Mathieu Reymond
Matteo Papini
Matteo Pirotta
Matthew E. Taylor
Matthieu Geist
Nico Montali
Nicolò A Cesa-Bianchi
Olivier Bachem
Omar Darwiche Domingues
Paolo Bonetti
Patrick Mannion
Patrick Saux
Peter Vamplew
Philippe Preux
Pierluca D’Oro
Pierre Liotet
Pierre Menard
Prashanth L.A.
Puze Liu
Quanquan Gu
Rafael Rodriguez Sanchez
Rahul Savani
Riad Akrour
Riccardo Poiani
Richard S Sutton
Robert Dadashi
Roberta Raileanu
Romina Abachi
Ronald Ortner
Roxana Radulescu
Rui YUAN
Samuele Tosatto
Shangdong Yang
Simon Du
Tal Lancewicki
Taylor W Killian
Tengyang Xie
Thanh Nguyen-Tang
Tian Xu
Tianwei Ni
Tom Schaul
Tom Zahavy
Tommaso R Cesari
Weitong ZHANG
Yannis Flet-Berliac
Yi Su
Yishay Mansour
Younggyo Seo

Accepted Papers

Direct Advantage Estimation
Pan, Hsiao-Ru*; Gürtler, Nico; Neitz, Alexander; Schölkopf, Bernhard Accept (Oral)
Newton-based Policy Search for Networked Multi-agent Reinforcement Learning
Manganini, Giorgio*; Fioravanti, Simone; Ramponi, Giorgia Accept (Oral)
A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits
Laforgue, Pierre; Clerici, Giulia*; Cesa-Bianchi, Nicolò; Gilad-Bachrach, Ran Accept (Oral)
Local Feature Swapping for Generalization in Reinforcement Learning
Bertoin, David*; Rachelson, Emmanuel Accept (Oral)
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality
Zahavy, Tom*; Schroecker, Yannick; Behbahani, Feryal; Baumli, Kate; Flennerhag, Sebastian; Hou, Shaobo; Singh, Satinder Accept (Oral)
Dynamic Pricing with Online Data Aggregation and Learning
Genalti, Gianmarco*; Mussi, Marco; Nuara, Alessandro; Gatti, Nicola Accept (Oral)
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
Lauriere, Mathieu; Perrin, Sarah*; Girgin, Sertan; Muller, Paul; Jain, Ayush; Cabannes, Théophile; Piliouras, Georgios; Perolat, Julien; Élie, Romuald; Pietquin, Olivier; Geist, Matthieu Accept (Oral)
Optimistic PAC Reinforcement Learning: the Instance-Dependent View
Tirinzoni, Andrea*; Al Marjani, Aymen; Kaufmann, Emilie Accept (Oral)
Group Fairness in Reinforcement Learning
Satija, Harsh*; Lazaric, Alessandro; Pirotta, Matteo; Pineau, Joelle Accept (Oral)
IQ-Learn: Inverse soft-Q Learning for Imitation
Garg, Divyansh*; Chakraborty, Shuvam; Cundy, Chris; Song, Jiaming; Geist, Matthieu; Ermon, Stefano Accept (Oral)
In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications
G. León, Borja*; Shanahan, Murray; Belardinelli, Francesco Accept (Poster)
A Deep Reinforcement Learning Approach to Supply Chain Inventory Management
Stranieri, Francesco*; Stella, Fabio Accept (Poster)
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Jacq, Alexis*; Ferret, Johan; Geist, Matthieu; Pietquin, Olivier Accept (Poster)
Continuous Control with Action Quantization from Demonstrations
Dadashi, Robert; Hussenot, Léonard*; Vincent, Damien; Girgin, Sertan; Raichuk, Anton; Geist, Matthieu; Pietquin, Olivier Accept (Poster)
A Unifying Framework for Reinforcement Learning and Planning
Moerland, Thomas M*; Broekens, Joost; Plaat, Aske; Jonker, Catholijn M Accept (Poster)
Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets
Strupl, Miroslav*; Faccio, Francesco; Ashley, Dylan R; Schmidhuber, Jürgen ; Srivastava, Rupesh Kumar Accept (Poster)
Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses
Shang, Xuedong*; Shao, Han; Qian, Jian Accept (Poster)
Semi-Counterfactual Risk Minimization Via Neural Networks
Aminian, Gholamali*; Vega, Roberto I; Rivasplata, Omar; Toni, Laura; Rodrigues, Miguel Accept (Poster)
When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits
Azize, Achraf*; Basu, Debabrota Accept (Poster)
Deep Coherent Exploration for Continuous Control
Zhang, Yijie*; van Hoof, Herke Accept (Poster)
Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration \& Planning
Ouhamma, Reda*; Basu, Debabrota; Maillard, Odalric Accept (Poster)
Neural Distillation as a State Representation Bottleneck in Reinforcement Learning
Guillet, Valentin*; Wilson, Dennis; Aguilar-Melchor, Carlos; Rachelson, Emmanuel Accept (Poster)
Tabular and Deep Learning of Whittle Index
Robledo, Francisco*; Ayesta, Urtzi; Avrachenkov, Konstantin; Borkar, Vivek Accept (Poster)
Learning Efficiently Function Approximation for Contextual MDP
Levy, Orin*; Mansour, Yishay Accept (Poster)
Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP
Levy, Orin*; Mansour, Yishay Accept (Poster)
Look where you look! Saliency-guided Q-networks for visual RL tasks
Bertoin, David*; Zouitine, Adil; Zouitine, Mehdi; Rachelson, Emmanuel Accept (Poster)
Quantification of Transfer in Reinforcement Learning via Regret Bounds for Learning Agents
Tuynman, Adrienne; Ortner, Ronald* Accept (Poster)
Regret Bounds for Satisficing in Multi-Armed Bandit Problems
Michel, Thomas; Hajiabolhassan, Hossein; Ortner, Ronald* Accept (Poster)
Risk-aware linear bandits with convex loss
Saux, Patrick*; Maillard, Odalric Accept (Poster)
Interactive Inverse Reinforcement Learning
Kleine Büning, Thomas*; George, Anne-Marie; Dimitrakakis, Christos Accept (Poster)
Reinforcement Learning with a Terminator
Guy, Tennenholtz*; Merlis, Nadav; Shani, Lior; Mannor, Shie; Shalit, Uri; Chechik, Gal; Hallak, Assaf; Dalal, Gal Accept (Poster)
On Convergence of Neural asynchronous Q-iteration
Smirnova, Elena* Accept (Poster)
Curriculum Reinforcement Learning via Constrained Optimal Transport
Klink, Pascal; Yang, Haoyi; D’Eramo, Carlo*; Peters, Jan; Pajarinen, Joni Accept (Poster)
Cross-Entropy Soft-Risk Reinforcement Learning
Greenberg, Ido*; Chow, Yinlam; Ghavamzadeh, Mohammad; Mannor, Shie Accept (Poster)
Active Exploration for Inverse Reinforcement Learning
Lindner, David*; Krause, Andreas; Ramponi, Giorgia Accept (Poster)
Offline Credit Assignment in Deep Reinforcement Learning with Hindsight Discriminator Networks
Ferret, Johan*; Pietquin, Olivier; Geist, Matthieu Accept (Poster)
Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs
Avalos, Raphael*; Reymond, Mathieu; Nowé, Ann; Roijers, Diederik M Accept (Poster)
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
Liu, Qinghua*; Szepesvari, Csaba; Jin, Chi Accept (Poster)
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Faccio, Francesco*; Ramesh, Aditya; Herrmann, Vincent; Harb, Jean; Schmidhuber, Jürgen Accept (Poster)
Goal-Conditioned Generators of Deep Policies
Faccio, Francesco*; Herrmann, Vincent; Ramesh, Aditya; Kirsch, Louis; Schmidhuber, Jürgen Accept (Poster)
A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management
Meisheri, Hardik*; Nath, Somjit; Baranwal, Mayank; Khadilkar, Harshad Accept (Poster)
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Jin, Tiancheng; Lancewicki, Tal*; Luo, Haipeng; Mansour, Yishay; Rosenberg, Aviv Accept (Poster)
Rate-Optimal Online Convex Optimization in Adaptive Linear Control
Cassel, Asaf B*; Cohen, Alon; Koren, Tomer Accept (Poster)
Cooperative Online Learning in Stochastic and Adversarial MDPs
Lancewicki, Tal*; Rosenberg, Aviv; Mansour, Yishay Accept (Poster)
Multi-Objective Coordination Graphs for the Expected Scalarised Returns with Generative Flow Models
Hayes, Conor F*; Verstraeten, Timothy; Roijers, Diederik M; Howley, Enda; Mannion, Patrick Accept (Poster)
Get Back Here: Robust Imitation by Return-to-Distribution Planning
Cideron, Geoffrey*; Pietquin, Olivier; Dadashi, Robert; Dulac-Arnold, Gabriel; Tabanpour, Baruch; Geist, Matthieu; Hussenot, Léonard; Curi, Sebastian; Girgin, Sertan Accept (Poster)
Analysis of Stochastic Processes through Replay Buffers
Di-Castro, Shirli*; Mannor, Shie; Di Castro, Dotan Accept (Poster)
Mixture of Interpretable Experts for Continuous Control
Tateo, Davide*; Akrour, Riad; Peters, Jan Accept (Poster)
On Reward Binarisation and Bayesian Agents
Catt, Elliot*; Hutter, Marcus; Veness, Joel Accept (Poster)
$Q$-Learning for $L_p$ Robust Markov Decision Processes
Kumar, Navdeep*; Wang, Kaixin; Levy, Kfir; Mannor, Shie Accept (Poster)
First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation
Yang, Zhao*; Moerland, Thomas M; Preuss, Mike; Plaat, Aske Accept (Poster)
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
Sancaktar, Cansu *; Blaes, Sebastian; Martius, Georg Accept (Poster)
Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies
YUAN, Rui*; Gower, Robert M; Lazaric, Alessandro; Du, Simon; Xiao, Lin Accept (Poster)
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs
Rouyer , Chloé *; van der Hoeven, Dirk; Cesa-Bianchi, Nicolò; Seldin, Yevgeny Accept (Poster)
RLDesigner: Toward Framing Spatial Layout Planning as a Markov Decision Process
Kakooee, Reza*; Dillenburger, Benjamin Accept (Poster)
Belief states of POMDPs and internal states of recurrent RL agents: an empirical analysis of their mutual information
Lambrechts, Gaspard*; Bolland, Adrien; Ernst, Damien Accept (Poster)
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits
Neu, Gergely; Olkhovskaya, Julia; Papini, Matteo*; Schwartz, Ludovic Accept (Poster)
Formulation and validation of a complete car-following model based on deep reinforcement learning
Hart, Fabian* Accept (Poster)
Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs
Tirinzoni, Andrea*; Al Marjani, Aymen; Kaufmann, Emilie Accept (Poster)
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees
Tirinzoni, Andrea*; Papini, Matteo; Touati, Ahmed; Lazaric, Alessandro; Pirotta, Matteo Accept (Poster)
Learning Generative Models with Goal-conditioned Reinforcement Learning
Vargas Vieyra, Mariana*; Menard, Pierre Accept (Poster)
Adaptive Belief Discretization for POMDP Planning
Grover, Divya*; Dimitrakakis, Christos Accept (Poster)
Boosting reinforcement learning with sparse and rare rewards using Fleming-Viot particle systems
Mastropietro, Daniel G*; Majewski, Szymon; Ayesta, Urtzi; Jonckheere, Matthieu Accept (Poster)
A Sparse Linear Program for Global Planning in Large MDPs
Neu, Gergely; Okolo, Nneka M* Accept (Poster)
A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback
Masoudian, Saeed*; Zimmert, Julian; Seldin, Yevgeny Accept (Poster)
Entropy Regularized Reinforcement Learning with Cascading Networks
Shilova, Alena; Della Vecchia, Riccardo; Preux, Philippe; Akrour, Riad* Accept (Poster)
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularization
Patil, Gandharv*; L.A., Prashanth; Precup, Doina Accept (Poster)
On learning history-based policies for controlling Markov Decision Processes
Patil, Gandharv*; Mahajan, Aditya; Precup, Doina Accept (Poster)
Minimax-Bayes Reinforcement Learning
Kleine Büning, Thomas; Dimitrakakis, Christos; Eriksson, Hannes; Grover, Divya; Jorge, Emilio* Accept (Poster)
On Bayesian Value Function Distributions.
Jorge, Emilio; Eriksson, Hannes*; Dimitrakakis, Christos; Basu, Debabrota; Grover, Divya Accept (Poster)
TempRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning
Bagatella, Marco*; Christen, Sammy; Hilliges, Otmar Accept (Poster)
Optimistic Risk-Aware Model-based Reinforcement Learning
Abachi, Romina*; Farahmand, Amir-massoud Accept (Poster)

Photos from the Workshop

Sponsors

Code of Conduct

The official EWRL 2022 Code of Conduct can be found here

