EWRL12 (2015)
The 12th European Workshop on Reinforcement Learning (EWRL 2015)
Dates: 10 – 11 July 2015
Location: Lille, France (2-day ICML Workshop)
Description
The 12th European Workshop on Reinforcement Learning (EWRL 2015) invites reinforcement learning researchers to participate in the revival of this world-class event. We plan to make this an exciting event for researchers worldwide, not only for the presentation of top-quality papers, but also as a forum for ample discussion of open problems and future research directions. EWRL 2015 will consist of six keynote talks, contributed paper presentations, and discussion sessions spread over a two-day period.
Reinforcement learning is an active field of research that deals with the problem of sequential decision making in unknown (and often stochastic and/or partially observable) environments. Recently there has been a wealth of impressive empirical results as well as significant theoretical advances. Both types of advances are of great importance, and we would like to create a forum to discuss such interesting results.
The workshop will cover a range of sub-topics including (but not limited to):
- Exploration/Exploitation and multi-armed bandit
- Function approximation and large scale RL
- Theoretical aspects of RL
- Policy search methods
- Actor-critic methods
- Online learning methods
- Adversarial RL
- Risk-sensitive RL
- Transfer and multi-task RL
- Empirical evaluations in RL
- Partially observable RL
- Imitation learning and inverse RL
- Bayesian RL
- Multi-agent RL
- Knowledge Representation in RL
- Applications of RL
- Open problems
Keynote Speakers
- Marcus Hutter – Australian National University – Canberra, Australia
- Title: Universal Reinforcement Learning
- Abstract: There is great interest in understanding and constructing generally intelligent systems approaching and ultimately exceeding human intelligence. Universal AI is such a mathematical theory of machine super-intelligence. More precisely, AIXI is an elegant parameter-free theory of an optimal reinforcement learning agent embedded in an arbitrary unknown environment that possesses essentially all aspects of rational intelligence. The theory reduces all conceptual AI problems to pure computational questions. After a brief discussion of its philosophical, mathematical, and computational ingredients, I will give a formal definition and measure of intelligence, which is maximized by AIXI.
AIXI can be viewed as the most powerful Bayes-optimal sequential decision maker, for which I will present general optimality results. This also motivates some variations such as knowledge-seeking and optimistic agents, and feature reinforcement learning. Finally I present some recent approximations, implementations, and applications of this modern top-down approach to AI.
- Thomas G. Dietterich – Oregon State University – Corvallis, Oregon, USA
- Title: Efficient Sampling for Simulator-Defined MDPs
- Abstract: Extended value iteration can compute confidence intervals on the action values of an MDP based on samples from that MDP. Different confidence interval methods (e.g., Hoeffding bound, Empirical Bernstein Bound, Weissman et al. L1 multinomial interval, etc.) at each state lead to different confidence intervals throughout the MDP. This talk will address two questions. First, given a strong simulator for an MDP, a fixed policy, and a sampling budget, what is the best way to draw samples in order to obtain the tightest bound on the value of the policy in the start state? That is, what combination of confidence interval method and sampling strategy will give the tightest bounds? Second, how should we draw samples in order to simultaneously optimize the policy and obtain the tightest bounds on the resulting policy? Again, what confidence interval method and what sampling strategy should we use? I will present experiments that suggest partial answers to both questions.
- David Silver – Google DeepMind – London, UK
- Lihong Li – Microsoft Research
- Title: The Online Discovery Problem and Its Application to Lifelong Reinforcement Learning
- Abstract: Transferring knowledge across a sequence of related tasks is an important challenge in reinforcement learning. Despite much encouraging empirical evidence that shows benefits of transfer, there has been very little theoretical analysis. In this paper, we study a class of lifelong reinforcement-learning problems: the agent solves a sequence of tasks modeled as finite Markov decision processes (MDPs), each of which is from a finite set of MDPs with the same state/action spaces and different transition/reward functions. Inspired by the need for cross-task exploration in lifelong learning, we formulate a novel online discovery problem and give an optimal learning algorithm to solve it. Such results allow us to develop a new lifelong reinforcement-learning algorithm, whose overall sample complexity in a sequence of tasks is much smaller than that of single-task learning, with high probability, even if the sequence of tasks is generated by an adversary. Benefits of the algorithm are demonstrated in a simulated problem.
- Shie Mannor – Technion
- Title: Risk in RL: Nothing ventured, nothing gained
- Abstract: We consider the role risk plays in dynamic decision problems. Different risk-conscious criteria, such as mean-variance tradeoffs, conditional value at risk, semi-deviation, exponential utility and others, have been studied in the RL/ADP literature by us and others. We explain the complexity and simulation issues involved in evaluating and optimizing these risk measures. Our main theme is that considering risk is essential to obtain resilience to model uncertainty and even model mismatch. We propose a scheme we call “risk shaping”: an approach to modify the risk criterion to be optimized in such a way that it best matches the overall task at hand.
- Csaba Szepesvari – University of Alberta
- Title: Lazy Posterior Sampling for Parametric Nonlinear Control
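As a compact reference for the AIXI agent discussed in Marcus Hutter's abstract above, the standard action-selection rule from Hutter's Universal AI framework can be written as follows, where U is a universal Turing machine, q a program of length ℓ(q), and m the horizon. This is background notation added for readers, not material from the talk itself.

```latex
% AIXI action selection: at cycle k the agent picks the action that maximizes
% expected future reward, with the unknown environment replaced by a
% Solomonoff-style mixture over all programs q consistent with the history of
% actions a, observations o, and rewards r.
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```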
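As background for Thomas Dietterich's abstract above, the sketch below contrasts two of the confidence-interval constructions it mentions, the Hoeffding bound and the empirical Bernstein bound, for the mean of bounded i.i.d. samples. It is a minimal illustration under assumed inputs (the function names and the sample returns are ours), not code from the talk.

```python
# Minimal sketch: two confidence-interval radii for the mean of i.i.d. samples
# bounded in an interval of width `value_range`, each holding with probability
# at least 1 - delta.  Illustrative only; not from the talk.
import math
import statistics

def hoeffding_radius(n, delta, value_range=1.0):
    """Two-sided Hoeffding radius: depends only on the range and sample size."""
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def empirical_bernstein_radius(samples, delta, value_range=1.0):
    """One common form of the empirical Bernstein radius (Maurer & Pontil, 2009);
    tighter than Hoeffding when the observed variance is small."""
    n = len(samples)
    var = statistics.variance(samples)  # unbiased sample variance
    return (math.sqrt(2.0 * var * math.log(2.0 / delta) / n)
            + 7.0 * value_range * math.log(2.0 / delta) / (3.0 * (n - 1)))

# Hypothetical returns of a fixed policy sampled from a simulator.
returns = [0.7, 0.4, 0.9, 0.6, 0.5, 0.8, 0.7, 0.6]
mean = statistics.mean(returns)
print(f"Hoeffding     : {mean:.2f} +/- {hoeffding_radius(len(returns), 0.05):.2f}")
print(f"Emp. Bernstein: {mean:.2f} +/- {empirical_bernstein_radius(returns, 0.05):.2f}")
```

Because the Bernstein radius shrinks with the observed variance while the Hoeffding radius does not, the choice of interval, and of where to spend the sampling budget, directly affects how tight the resulting value bounds are, which is the trade-off the talk examines.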
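Similarly, as background for Shie Mannor's abstract above, here is a minimal sketch of one of the risk measures it lists, conditional value at risk (CVaR), estimated from Monte-Carlo samples of a policy's return. The alpha level and the sample returns are made up for illustration.

```python
# Minimal sketch: empirical CVaR of a policy's return.  For returns (higher is
# better), CVaR_alpha is the average of the worst alpha-fraction of outcomes.
import math

def cvar(returns, alpha=0.05):
    """Average of the worst alpha-fraction of sampled returns."""
    ordered = sorted(returns)                    # worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))  # number of tail samples to keep
    tail = ordered[:k]
    return sum(tail) / len(tail)

# Hypothetical simulated returns: a policy with a good mean but a heavy left tail.
sampled_returns = [10.0, 9.5, 11.2, 8.7, -4.0, 10.3, 9.9, -6.5, 10.8, 9.1]
print("mean   :", sum(sampled_returns) / len(sampled_returns))
print("CVaR 5%:", cvar(sampled_returns, alpha=0.05))
```

Optimizing a tail-sensitive criterion of this kind, rather than the mean alone, is what provides the resilience to model uncertainty and mismatch that the abstract argues for.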
Tentative Workshop Schedule
Download the EWRL schedule in pdf format.
List of Accepted Contributions
- Why Multi-objective Reinforcement Learning?
- A multiplicative UCB strategy for Gamma rewards
- Contextual Markov Decision Processes
- Off-policy Model-based Learning under Unknown Factored Dynamics
- Non-Parametric Policy Learning for High-Dimensional State Representations
- Deep Sequential Neural Networks
- Model-Free Preference-based Reinforcement Learning
- Dueling Bandits as a Partial Monitoring Game
- An Empirical Evaluation of True-Online TD(λ)
- Sample-based abstraction for hybrid relational MDPs
- Learning to coordinate without communication in multi-user multi-armed bandit problems
- Parallel Reinforcement Learning with State Action Space Partitioning
- Policy Gradient for Coherent Risk Measures
- Generalized Advantage Estimation for Policy Gradients
- Reinforced Decision Trees
- Differentially private multi-agent multi-armed bandits
- Multi-Armed Bandit for Pricing
- Using PCA to Efficiently Represent State Spaces
- A Reinforcement Learning Approach to Online Learning of Decision Trees
- PAC Algorithms for the Infinitely-Many Armed Problem with Multiple Pools
- Emphatic Temporal-Difference Learning
- Imitation Learning for Accelerating Iterative Computation of Fixed Points
- Learning Policies for Data Imputation with Guided Policy Search
- Explore no more: Simple and tight high-probability bounds for non-stochastic bandits
- On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence
Paper Submission
We are calling for papers (and posters) from the entire reinforcement learning spectrum. Submissions may be either 2-page position papers (on which open discussion will be held) or longer research papers of up to 8 pages (plus one page for references) in JMLR format [link]. We encourage a broad range of submissions to foster wide discussion.
A selection of accepted papers will appear in the prestigious JMLR W&CP (Workshop and Conference Proceedings).
Double submissions are allowed (e.g., with ICML). However, in the event that an EWRL paper is accepted to another conference proceedings or journal, copyright restrictions prevent it from being reprinted in the JMLR W&CP. The paper would still be considered for acceptance and presentation at EWRL.
- Submission deadline: 01-May-2015 — EXTENSION: 03-MAY, 23:59 Universal Time
- Page limit: 2 pages for position papers and 8 pages plus one page with references for regular papers.
- Paper format: JMLR format
- Submission website: https://easychair.org/conferences/?conf=ewrl122015
- The review process is double-blind
Important Dates
- Paper submissions due: 01-May-2015 (extended to 03-May-2015)
- Notification of acceptance: 10-May-2015
- Workshop dates: 10–11 July 2015
Organizing Committee
- Alessandro Lazaric – INRIA – Lille, France
- Mohammad Ghavamzadeh – Adobe and INRIA – Lille, France
- Remi Munos – Google DeepMind and INRIA – Lille, France
Sponsors
Program Committee
- Yasin Abbasi-Yadkori
- Peter Auer
- Andre Barreto
- Marc Bellemare
- Emma Brunskill
- Christian Daniel
- Marc Deisenroth
- Christos Dimitrakakis
- Amir-Massoud Farahmand
- Victor Gabillon
- Matthieu Geist
- Alborz Geramifard
- Mohammad Ghavamzadeh
- Mohammad Gheshlaghi-Azar
- Matthew Hoffman
- Alessandro Lazaric
- Rupam Mahmood
- Odalric-Ambrym Maillard
- Timothy Mann
- Jeremie Mary
- Remi Munos
- Gergely Neu
- Gerhard Neumann
- Ann Nowe
- Laurent Orseau
- Ronald Ortner
- Simone Parisi
- Olivier Pietquin
- Bilal Piot
- Doina Precup
- Marcello Restelli
- Scott Sanner
- Peter Sunehag
- Csaba Szepesvari
- Georgios Theocharous
- Michal Valko
- Martijn van Otterlo
- Nikos Vlassis
Registration
Since EWRL 2015 will be organized as an ICML workshop, the ICML workshop fees have to be paid. EWRL will not charge any additional fees.
Workshop Venue


