EWRL14 (2018)

Here you can find information about the workshop structure, dates, and speakers. Please contact the organizers Matteo Pirotta, Ronan Fruit, Mathieu Seurin, or Florian Strub if you have any questions.

Registration is CLOSED!

Note: If you want to attend EWRL and need an invitation letter for visa purposes, please contact us directly.

The SOCIAL DINNER will take place on Tue Oct 2nd at 19:30.

Note: There will be no proceedings of EWRL14.

The 14th European Workshop on Reinforcement Learning (EWRL 2018)

Dates:    October 1–3, 2018
Location: École nationale supérieure d’arts et métiers (ENSAM), 8 Boulevard Louis XIV, 59800 Lille, France

[description] [submission] [dates] [committees] [papers] [registration] [venue] [schedule] [sponsors] [book a room]


The 14th European workshop on reinforcement learning (EWRL 2018) invites reinforcement learning researchers to participate in the revival of this world-class event. We plan to make this an exciting event for researchers worldwide, not only for the presentation of top quality papers but also as a forum for ample discussion of open problems and future research directions. EWRL 2018 will consist of 3+ tutorials (yes!), 10+ keynote talks, contributed paper presentations, discussion sessions spread over a three-day period, and a poster session.

Reinforcement learning is an active field of research that deals with the problem of sequential decision making in unknown, and often stochastic and/or partially observable, environments. Recently there has been a wealth of impressive empirical results as well as significant theoretical advances. Both types of advances are of great importance, and we would like to create a forum to discuss them.

The workshop will cover a range of sub-topics including (but not limited to):

  • Exploration/Exploitation
  • Function approximation in RL
  • Theoretical aspects of RL
  • Policy search methods
  • Empirical evaluations in RL
  • Kernel methods for RL
  • Partially observable RL
  • Bayesian RL
  • Multi-agent RL
  • Risk-sensitive RL
  • Financial RL
  • Knowledge Representation in RL
  • Neural RL

Confirmed Invited Speakers

  • Richard Sutton
  • Martin Riedmiller
  • Joelle Pineau
  • Nicolò Cesa-Bianchi
  • Tze Leung Lai
  • Rémi Munos
  • Gergely Neu
  • Audrey Durand
  • Karl Tuyls
  • Katja Hofmann


Tutorials

  • Advanced Topics in Exploration: Csaba Szepesvári and Tor Lattimore
  • Deep Reinforcement Learning: Hado van Hasselt
  • Towards Safe Reinforcement Learning: Andreas Krause and Felix Berkenkamp

Paper Submission

We invite submissions for the 14th European Workshop on Reinforcement Learning (EWRL 2018) from the entire reinforcement learning spectrum. Authors can submit a 2–6 page paper in JMLR format (excluding references), which will be reviewed by the program committee in a double-blind procedure. Papers can present new work or summarize recent work of the author(s). All papers will be considered for the poster sessions. Outstanding long papers (4–6 pages) will also be considered for a 20-minute oral presentation. There will be no proceedings of EWRL14.

Camera ready: You can use two additional pages (the limit is now 8 pages + references and appendix). We have updated jmlr2e.sty for EWRL14. You need to update your source file as follows:

  • remove the \editor command
  • replace \jmlrheading with \ewrlheading using the following structure:
\ewrlheading{14}{2018}{October 2018, Lille, France}{}
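In the preamble of your source file, the two edits above amount to a small change. A minimal sketch, assuming a standard JMLR-style source file (the document class, options, and the removed commands' arguments are illustrative placeholders, not the actual contents of your paper):

```latex
% Camera-ready preamble sketch (illustrative) applying the two edits above.
\documentclass[twoside]{article}
\usepackage{jmlr2e}   % the EWRL14-updated style file

% Before (JMLR defaults, to be removed/replaced):
% \editor{...}        <- remove this command entirely
% \jmlrheading{...}   <- replace with \ewrlheading as below

% After (EWRL14):
\ewrlheading{14}{2018}{October 2018, Lille, France}{}

\begin{document}
% paper body (up to 8 pages + references and appendix)
\end{document}
```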

Important Dates

  • Paper submissions due: 21 June 2018, 23:59 CET (extended from 15 June 2018)
  • Notification of acceptance: end of July 2018
  • Camera ready due: 14 September 2018
  • Workshop begins: 1 October 2018
  • Workshop ends: 3 October 2018

Organizing Committee

The SequeL team, INRIA Lille; in particular Matteo Pirotta, Ronan Fruit, Mathieu Seurin, and Florian Strub.

External organizers

  • Jérémie Mary (Criteo Research)
  • Olivier Pietquin (Google Brain)
  • Gabriel Dulac-Arnold (Squirrel Group)

Program Committee

  • Marc Abeille
  • Riad Akrour
  • Oleg Arenz
  • Lilian Besson
  • Roberto Calandra
  • Daniele Calandriello
  • Christian Daniel
  • Christos Dimitrakakis
  • Layla El Asri
  • Ronan Fruit
  • Victor Gabillon
  • Pratik Gajane
  • Matthieu Geist
  • Mohammad Gheshlaghi Azar
  • Anna Harutyunyan
  • Maximilian Hüttenrauch
  • Anders Jonsson
  • Emilie Kaufmann
  • Johannes Kirschner
  • Akshay Krishnamurthy
  • Alessandro Lazaric
  • Odalric-Ambrym Maillard
  • Timothy Mann
  • Jérémie Mary
  • Remi Munos
  • Gergely Neu
  • Ann Nowé
  • Laurent Orseau
  • Ronald Ortner
  • Ian Osband
  • Simone Parisi
  • Vianney Perchet
  • Julien Perolat
  • Pierre Perrault
  • Olivier Pietquin
  • Joelle Pineau
  • Bilal Piot
  • Matteo Pirotta
  • Philippe Preux
  • Marcello Restelli
  • Mathieu Seurin
  • Florian Strub
  • Sadegh Talebi
  • Herke van Hoof
  • Claire Vernade


Accepted Papers


  • [D1] Fighting Boredom in Recommender Systems with Linear Reinforcement Learning. Romain Warlop, Alessandro Lazaric and Jérémie Mary.
  • [D2] Learning good policies from suboptimal demonstrations. Yuxiang Li, Katja Hofmann and Ian Kash.
  • [D2] Transferring Value Functions via Variational Methods. Andrea Tirinzoni, Rafael Rodriguez and Marcello Restelli.
  • [D2] Safely Exploring Policy Gradient. Matteo Papini, Andrea Battistello and Marcello Restelli.
  • [D1] Constraint-Space Projection Direct Policy Search. Riad Akrour, Jan Peters and Gerhard Neumann.
  • [D2] Exponential Weights on the Hypercube in Polynomial Time. Sudeep Raja Putta.
  • [D2] Directed Policy Gradient for Safe Reinforcement Learning with Human Advice. Hélène Plisnier, Denis Steckelmacher, Tim Brys, Diederik Roijers and Ann Nowé.
  • [D2] When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms. Yao Liu and Emma Brunskill.
  • [D1] Leveraging Observational Learning for Exploration in Bandits. Audrey Durand, Andrei Lupu and Doina Precup.
  • [D1] A0C: Alpha Zero in Continuous Action Space. Thomas Moerland, Joost Broekens, Aske Plaat and Catholijn Jonker.
  • [D1] When Gaussian Processes Meet Combinatorial Bandits: GCB. Guglielmo Maria Accabi, Alessandro Nuara, Francesco Trovò, Nicola Gatti and Marcello Restelli.
  • [D1] Anderson Acceleration for Reinforcement Learning. Matthieu Geist and Bruno Scherrer.
  • [D1] Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies. Tom Zahavy, Avinatan Hasidim, Haim Kaplan and Yishay Mansour.
  • [D1] A Parameter Investigation of the ϵ-greedy Exploration Ratio Adaptation Method in Multi-agent Reinforcement Learning. Takuya Okano and Itsuki Noda.
  • [D2] Adaptive black-box optimization got easier: HCT needs only local smoothness. Xuedong Shang, Emilie Kaufmann and Michal Valko.
  • [D2] Counting to Explore and Generalize in Text-based Games. Xingdi Yuan, Marc-Alexandre Côté, Alessandro Sordoni, Matthew Hausknecht and Adam Trischler.
  • [D1] Mean squared advantage minimization as a consequence of entropic policy improvement regularization. Boris Belousov and Jan Peters.
  • [D1] Reinforcement learning for supply chain optimization. Lukas Kemmer, Henrik von Kleist, Diego María De Grimaudet De Rochebouët, Nikolaos Tziortziotis and Jesse Read.
  • [D1] Randomised Bayesian Least-Squares Policy Iteration. Nikolaos Tziortziotis, Christos Dimitrakakis and Michalis Vazirgiannis.
  • [D2] Towards learning to best respond when losing control. Richard Klima, Daan Bloembergen, Michael Kaisers and Karl Tuyls.
  • [D2] KL-UCRL Revisited: Variance-Aware Regret Bounds. M. Sadegh Talebi and Odalric-Ambrym Maillard.
  • [D1] Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods. Peter Henderson, Joshua Romoff and Joelle Pineau.
  • [D2] Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning. Eli Friedman and Fred Fontaine.
  • [D1] Randomized Value Functions via Multiplicative Normalizing Flows. Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau and Pascal Vincent.
  • [D2] Sample Efficient Learning with Feature Selection for Factored MDPs. Zhaohan Guo and Emma Brunskill.
  • [D2] A Fitted-Q Algorithm for Budgeted MDPs. Nicolas Carrara, Olivier Pietquin, Romain Laroche, Tanguy Urvoy and Jean-Léon Bouraoui.
  • [D2] The Potential of the Return Distribution for Exploration in Reinforcement Learning. Thomas Moerland, Joost Broekens and Catholijn Jonker.
  • [D1] Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning. Yonathan Efroni, Gal Dalal, Bruno Scherrer and Shie Mannor.
  • [D2] Recovering Bandits. Ciara Pike-Burke and Steffen Grunewalder.
  • [D1] Configurable Markov Decision Processes. Alberto Maria Metelli, Mirco Mutti and Marcello Restelli.
  • [D2] An Empirical Study of Least-Squares Algorithms in Reinforcement Learning. Howard Huang.
  • [D2] Stable, Practical and On-line Bootstrapped Conservative Policy Iteration. Denis Steckelmacher, Hélène Plisnier, Diederik M. Roijers and Ann Nowé.
  • [D1] Reinforcement Learning with Wasserstein Distance Regularisation, with Applications to Multipolicy Learning. Mohammed Abdullah, Moez Draief and Aldo Pacchiano.
  • [D2] Neural Value Function Approximation in Continuous State Reinforcement Learning Problems. Martin Gottwald, Mingpan Guo and Hao Shen.
  • [D1] Thompson Sampling for the non-stationary Corrupt Multi-Armed Bandit. Reda Alami.
  • [D1] Safe Policy Improvement with Baseline Bootstrapping. Romain Laroche and Paul Trichelair.
  • [D1] Adaptively Tracking the Best Arm with an Unknown Number of Distribution Changes. Peter Auer, Pratik Gajane and Ronald Ortner.
  • [D2] Soft Safe Policy Improvement with Baseline Bootstrapping. Kimia Nadjahi, Romain Laroche and Rémi Tachet des Combes.
  • [D1] Intra-day Bidding Strategies for Storage Devices Using Deep Reinforcement Learning. Ioannis Boukas, Damien Ernst, Anthony Papavasiliou and Bertrand Cornelusse.
  • [D2] TD-Regularized Actor-Critic Methods. Simone Parisi, Voot Tangkaratt, Jan Peters and Mohammad Khan.
  • [D2] Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning. Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz and Shie Mannor.
  • [D1] Combining No-regret and Q-learning. Ian Kash and Katja Hofmann.
  • [D2] ACCME: Actively Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning. Ronnie Stafford and John Shawe-Taylor.
  • [D1] Dynamic filtering in deep reinforcement learning. Raymond Chua and Rui Ponte Costa.


Registration is closed! Registration was capped at 200 participants due to capacity constraints.

  • Registration remained open until one week before the start of the workshop.
  • Early registration ended September 5th, 2018; after that date, the price increased.

If you need an invitation letter for visa purposes, please contact us directly.

Cancellation/Refund Policy

All cancellations must be notified by e-mail to the Conference Organizers. The following conditions will be applied:

  • Until August 31, 2018: a €50 administrative fee will be withheld;
  • From September 1st, 2018: no refund.

If you are unable to attend the conference, you can request the transfer of your registration to another participant by notifying the Conference Organizers by e-mail, before the conference.

Workshop Venue


EWRL14 takes place in Lille, France. The precise address is:

École Nationale Supérieure d’Arts et Métiers (Lille)

8 Boulevard Louis XIV, 59800 Lille, France

Travelling to Lille

Lille is easy to reach by road, fast train, or plane. We sketch the simplest routes to Lille from various parts of the world; many other options are possible.

  • Probably the easiest way to reach Lille from America, Asia, Oceania, Africa, and non-neighbouring countries (i.e., countries other than Belgium, the Netherlands, or Germany) is to fly to Paris Charles de Gaulle airport (CDG), which is connected to almost all countries worldwide. Once at CDG, catch the fast train (TGV) at the station located in Terminal 2; 50 minutes later you arrive at one of Lille's two train stations (located a few hundred meters from each other).
  • You may also fly to London, UK, travel to Saint Pancras station, board the Eurostar (the fast train between the UK and France), cross under the Channel, and arrive at Lille's train station.
  • From Belgium, the Netherlands, or Germany, it may be even easier to take a Thalys fast train straight to Lille (from Amsterdam, Brussels, or Köln).
  • From Brussels South Charleroi Airport there is a direct bus to Lille (https://www.flibco.com/en).

Looking for a room? See the accommodation info on the ICML 2015 page.

The social dinner will be held at Couvent des Minimes de Lille on Tue October 2nd at 19:30. See you there!

Workshop Schedule

Mon 1 (tutorials)

9:00 – 10:00 Check-in and welcoming coffee

9:45 – 10:00 Opening remarks

10:00 – 12:00 Advanced topics in exploration: The role of randomization for exploration in bandits and RL (Csaba Szepesvári and Tor Lattimore)

12:00 – 14:00 Lunch break (on your own)

14:00 – 16:00 Deep Reinforcement Learning (Hado van Hasselt)

16:00 – 16:30 Coffee break

16:30 – 18:30 Towards Safe Reinforcement Learning (Andreas Krause and Felix Berkenkamp)

Tue 2

8:00 – 9:00 Check-in

9:00 – 9:45 Invited talk: Richard Sutton

9:45 – 10:05 Contributed talk (Anderson Acceleration for Reinforcement Learning)

10:05 – 10:55  Poster session 1 (with Coffee break)

10:55 – 11:40 Invited talk: Tze Leung Lai

11:40 – 12:00 Contributed talk (When Gaussian Processes Meet Combinatorial Bandits: GCB)

12:00 – 14:00 Lunch break (on your own)

14:00 – 14:20 Contributed talk (Fighting Boredom in Recommender Systems with Linear Reinforcement Learning)

14:20 – 15:05 Invited talk: Nicolò Cesa-Bianchi

15:05 – 15:25 Contributed talk (A0C: Alpha Zero in Continuous Action Space)

15:25 – 16:10 Invited talk: Martin Riedmiller

16:10 – 17:10 Poster session 2 (with Coffee break)

17:10 – 17:55 Invited talk: Gergely Neu

17:55 – 18:15 Contributed talk (Constraint-Space Projection Direct Policy Search)

Wed 3

9:00 – 9:20 Contributed talk (Learning good policies from suboptimal demonstrations)

9:20 – 10:05 Invited talk: Joelle Pineau

10:05 – 10:55  Poster session 3 (with Coffee break)

10:55 – 11:40 Invited talk: Rémi Munos

11:40 – 12:00 Contributed talk (Directed Policy Gradient for Safe Reinforcement Learning with Human Advice)

12:00 – 14:00 Lunch break (on your own)

14:00 – 14:20 Contributed talk (Towards learning to best respond when losing control)

14:20 – 15:05 Invited talk: Karl Tuyls

15:05 – 15:25 Contributed talk (Transferring Value Functions via Variational Methods)

15:25 – 16:10 Invited talk: Katja Hofmann

16:10 – 17:10 Poster session 4 (with Coffee break)

17:10 – 17:55 Invited talk: Audrey Durand

17:55 – 18:15 Contributed talk (Counting to Explore and Generalize in Text-based Games)

18:15 – 18:30 Closing Remarks










Sponsors: Amazon, Google



