Wednesday, November 23, 2016

Hierarchical Reinforcement Learning: A Literature Summary

This is a quick summary of current work on hierarchical reinforcement learning (RL) aimed at students choosing to do hierarchical RL projects under my supervision.

The most common formalisation of hierarchical RL, in terms of semi-MDPs, was given by Sutton, Precup and Singh.
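
To make the semi-MDP view concrete, here is a minimal Python sketch of a temporally extended action (an "option") described by an initiation set, an intra-option policy and a termination condition. Running the option to completion yields a single semi-MDP transition: the state reached, the discounted reward accumulated along the way, and the number of primitive steps taken. The names Option, run_option and chain_step, and the five-state chain environment, are assumptions made up for illustration and are not taken from any paper or library.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple

State = int
Action = int


@dataclass
class Option:
    """Hypothetical container for one temporally extended action."""
    initiation_set: Set[State]              # states in which the option may be started
    policy: Callable[[State], Action]       # intra-option policy
    termination: Callable[[State], float]   # probability of stopping in a given state


def run_option(
    option: Option,
    state: State,
    step: Callable[[State, Action], Tuple[State, float]],
    gamma: float = 0.9,
    rng: Optional[random.Random] = None,
) -> Tuple[State, float, int]:
    """Run the option until it terminates.

    Returns (final state, discounted reward, number of primitive steps),
    i.e. one transition of the induced semi-MDP.
    """
    if state not in option.initiation_set:
        raise ValueError("option cannot be started in this state")
    rng = rng or random.Random(0)
    total, discount, duration = 0.0, 1.0, 0
    while True:
        state, reward = step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        duration += 1
        # Sample the termination condition in the newly reached state.
        if rng.random() < option.termination(state):
            return state, total, duration


def chain_step(state: State, action: Action) -> Tuple[State, float]:
    """Assumed toy environment: states 0..4 on a line, reward 1 on entering state 4."""
    nxt = max(0, min(4, state + action))
    return nxt, (1.0 if nxt == 4 else 0.0)


# An option that walks right and stops only once state 4 is reached.
go_right = Option(
    initiation_set={0, 1, 2, 3},
    policy=lambda s: 1,
    termination=lambda s: 1.0 if s == 4 else 0.0,
)

final_state, discounted_reward, duration = run_option(go_right, 0, chain_step)
```

From the point of view of a higher-level policy, go_right behaves like a single action whose outcome happens to take several time steps, which is exactly what the semi-MDP formalism captures.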

In 2015, Pierre-Luc Bacon, Jean Harb and Doina Precup published an article entitled 'The Option-Critic Architecture', describing an algorithm for automatically sub-dividing and solving an RL problem.