11. MDPs & Q-Learning

Markov Decision Process 
State Transition Matrix
Markov Process 
Markov Reward Process 
Return  
Value Function 
Policies
Solution
Solution Methods
Batch Learning 
Q-Learning 
$\alpha$