The notes so far cover Bandit Algorithms, MDPs, Model-free Methods, Value Function Approximation, and Policy Optimization.
Moving on would require a lot more reading. Because of course finals and projects, I will have to leave it at that. Still, any advice would be appreciated.