I am a DPhil student at the University of Oxford. The goal of my research is to solve sequential decision-making problems in a scalable and reliable way. Currently, I focus on off-policy and offline reinforcement learning as a solution method. My work won the best paper award at AAMAS, and my research is funded by an EPSRC studentship. I spent some time at Microsoft Research and DeepMind during my DPhil. For more information, please visit https://shangtongzhang.github.io/