Statistics and Machine Learning Group
@ Indian Institute of Science, Bangalore

The videos below demonstrate our work on autonomous navigation using Reinforcement Learning (RL). The agent is trained with the novel Sequential Soft Option Critic (SSOC) algorithm. It learns new navigation skills as policies within the options framework, where each new policy is learned sequentially on top of the previously learned policies.

The video shows the agent navigating a complex Duckie Town environment using two policies. The observations are complex because of the various objects in the environment. Still, the agent navigates well and can take two turns in immediate succession with only a small number of policies.
The video shows navigation in our 3D color environment. If a wall is colored green, the agent is required to turn toward the colored wall. If the wall is colored red, the agent is required to turn in the direction opposite to the colored wall. We can see that the agent understands this color coding well and navigates the environment accordingly.
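The color rule for the 3D color environment can be written as a small function. This is only an illustrative sketch: the name `required_turn` and the string encodings for wall colors and sides are assumptions made for this example, not the environment's actual interface. The trained agent is not given this rule; it has to learn the behaviour from its observations.

```python
def required_turn(wall_color: str, wall_side: str) -> str:
    """Return the side ("left" or "right") the agent should turn toward.

    wall_color: "green" or "red" (hypothetical encoding).
    wall_side:  side on which the colored wall appears, "left" or "right".
    """
    opposite = {"left": "right", "right": "left"}
    if wall_side not in opposite:
        raise ValueError(f"unknown wall side: {wall_side!r}")

    if wall_color == "green":
        # Green wall: turn toward the colored wall.
        return wall_side
    if wall_color == "red":
        # Red wall: turn away from the colored wall.
        return opposite[wall_side]
    raise ValueError(f"unknown wall color: {wall_color!r}")


if __name__ == "__main__":
    for color in ("green", "red"):
        for side in ("left", "right"):
            print(color, side, "->", required_turn(color, side))
```

A rule like this could serve as a reference when checking whether the agent's turns in the video match the intended color coding.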
© Statistics and Machine Learning Group, Dept. of Computer Science & Automation, Indian Institute of Science, Bangalore, 2013