Reinforcement Learning

Bellman equation
Optimal policy
Policy improvement
Exploration / exploitation
Exploration-exploitation trade-off
Q-Learning
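
The topics above can be tied together in one small example. What follows is a minimal sketch of tabular Q-learning, not a definitive implementation. The 5-state chain environment, the helper names (step, choose_action, q_learning, greedy_policy), and the hyperparameter values are all assumptions made for illustration. The update line is a sampled Bellman optimality backup, the epsilon-greedy rule is one common way to handle the exploration-exploitation trade-off, and reading off the greedy policy from the learned table is the policy improvement step.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy dynamics: reward 1.0 only on reaching the last state."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so an all-zero table does not always pick "left".
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; all default values here are illustrative assumptions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so an episode always ends
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Sampled Bellman optimality backup; terminal states bootstrap from 0.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy improvement: act greedily with respect to the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

In this toy chain the optimal policy is to move right in every non-terminal state, so the learned greedy policy can be checked against that. Because Q-learning is off-policy, the exploratory moves made by the epsilon-greedy behavior do not change the values it converges to; they only affect how quickly each state-action pair is visited.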