Code

Official code repositories (WhiRL lab)

Benchmark: SMAC: StarCraft Multi-Agent Challenge

A benchmark for multi-agent reinforcement learning research based on the StarCraft II game.

Code: https://github.com/oxwhirl/smac

Framework: PyMARL

A framework for research in deep multi-agent reinforcement learning with implementations of state-of-the-art algorithms, such as QMIX and COMA.

Code: https://github.com/oxwhirl/pymarl


Paper: “The StarCraft Multi-Agent Challenge” – Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson

Paper: https://arxiv.org/abs/1902.04043
Blogpost: http://whirl.cs.ox.ac.uk/blog/smac/

Code:
https://github.com/oxwhirl/smac
https://github.com/oxwhirl/pymarl

Paper: “Deep Variational Reinforcement Learning for POMDPs” – Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson

Code: https://github.com/maximilianigl/DVRL

Paper: “DiCE: The Infinitely Differentiable Monte-Carlo Estimator” – Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson

Code: https://github.com/alshedivat/lola

TensorFlow example: https://goo.gl/xkkGxN (https://drive.google.com/drive/folders/1qjuLTdRbM5CoyNGEyaCJdFKJ9UEwhU28)

Available in Pyro: https://github.com/uber/pyro

Available in TensorFlow Probability: https://github.com/tensorflow/probability/blob/master/tensorflow_probability/python/monte_carlo.py

Paper: “Learning with Opponent-Learning Awareness” – Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch

Code: https://github.com/alshedivat/lola

Paper: “TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning” – Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson [https://arxiv.org/abs/1710.11417]

Code: https://github.com/oxwhirl/treeqn

Paper: “Learning to Communicate with Deep Multi-Agent Reinforcement Learning” – Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson

Code: https://github.com/iassael/learning-to-communicate

Community reimplementations (untested by WhiRL)

Paper: “DiCE: The Infinitely Differentiable Monte-Carlo Estimator” – Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson

Code: https://github.com/alexis-jacq/LOLA_DiCE