News

1 Paper Accepted to UAI

We are pleased to announce that “Multitask Soft Option Learning” has been accepted to UAI 2020! Congratulations to the authors Maximilian Igl, Andrew Gambardella, Jinke He, Nantas Nardelli, N. Siddharth, Wendelin Böhmer, and Shimon Whiteson. The paper can be found here: https://arxiv.org/abs/1904.01033

AAMAS Best Paper Award

“Deep Residual Reinforcement Learning” received the best paper award at AAMAS 2020! Congratulations to the authors Shangtong Zhang, Wendelin Boehmer, and Shimon Whiteson. The paper revisits Baird’s residual algorithm in deep RL. When equipped with a bi-directional target network, it stabilizes off-policy learning, and it alleviates distribution mismatch in Dyna planning.
Paper: https://arxiv.org/abs/1905.01072
Announcement: https://aamas2020.conference.auckland.ac.nz/best-paper-and-demonstration/

2 Papers Accepted to AAMAS

We are happy about two full AAMAS papers with WhiRL members!
“Deep Residual Reinforcement Learning” – Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson (https://arxiv.org/abs/1905.01072)
“Maximizing Information Gain via Prediction Rewards” – Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans Oliehoek, Martha White

2 papers accepted to ICLR

We are very excited about two accepted ICLR 2020 papers and look forward to discussing our work in Ethiopia!
“Optimistic Exploration even with a Pessimistic Initialisation” – Tabish Rashid, Bei Peng, Wendelin Boehmer, Shimon Whiteson
“VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning” – Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson
Congratulations also to new WhiRL member Kristian, who is on two accepted papers with his old lab, including his first-author paper “Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery” [...]

8 papers accepted to NeurIPS

We are excited about 8 accepted papers with WhiRL members, and look forward to discussing our work at NeurIPS 2019 in Vancouver!
“Generalized Off-Policy Actor-Critic” – Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson (https://arxiv.org/abs/1903.11329)
“DAC: The Double Actor-Critic Architecture for Learning Options” – Shangtong Zhang, Shimon Whiteson (https://arxiv.org/abs/1904.12691)
“Fast Efficient Hyperparameter Tuning for Policy Gradient Methods” – Supratik Paul, Vitaly Kurin, Shimon Whiteson (https://arxiv.org/abs/1902.06583)
“VIREL: A Variational Inference Framework for Reinforcement Learning” – Matthew Fellows, Anuj Mahajan, Tim G. J. Rudner, Shimon Whiteson (Spotlight) (https://arxiv.org/abs/1811.01132)
“MAVEN: [...]

IJCAI Survey: Reinforcement Learning Informed by Natural Language

To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand. Recent advances in representation learning for language make it possible to build models that acquire world knowledge from text corpora and integrate this knowledge into downstream decision making problems. We thus argue that the time is right to investigate a tight integration of natural language understanding into RL [...]

Vacancy: PostDoc

Together with Katja Hofmann (Microsoft Cambridge), we are hiring a postdoc for a special joint position in WhiRL at the University of Oxford and Microsoft Research Cambridge. More information: http://www.cs.ox.ac.uk/news/1673-full.html

ICML 2019

WhiRL has four accepted papers at ICML this year! Camera-ready versions can be found here:
“A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs” – Jingkai Mao, Jakob Foerster, Tim Rocktäschel, Maruan Al-Shedivat, Gregory Farquhar and Shimon Whiteson
“Fingerprint Policy Optimisation for Robust Reinforcement Learning” – Supratik Paul, Michael A. Osborne and Shimon Whiteson
“Fast Context Adaptation via Meta-Learning” – Luisa Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann and Shimon Whiteson
“Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning” [...]

New blog post: Advice for short-term machine learning research projects

Tim Rocktäschel, Jakob Foerster and Greg Farquhar
Every year we get contacted by students who wish to work on short-term machine learning research projects with us. By now, we have supervised a good number of them and we noticed that some of the advice that we gave followed a few recurring principles. In this post, we share what we believe is good advice for a master’s thesis project or a summer research internship in machine learning. This post is by [...]

Five papers accepted at ICML 2018

All five of our submissions for ICML 2018 have just been accepted:
“DiCE: The Infinitely Differentiable Monte Carlo Estimator” – Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric Xing, Shimon Whiteson
“QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning” – Tabish Rashid, Mikayel Samvelyan, Christian Schroeder, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
“Fourier Policy Gradients” – Matthew Fellows, Kamil Ciosek, Shimon Whiteson
“Deep Variational Reinforcement Learning for POMDPs” – Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson
“TACO: Learning Task Decomposition via Temporal Alignment for Control” – Kyriacos Shiarlis, [...]