Policy of Q-learning agent

I love Reinforcement Learning and I also love math, but I know the first steps into RL can be pretty intimidating. So I started working on something that helps you understand the basics of RL without needing large amounts of math. I also kept the code “lean”, in a notebook style. Of course there’s some hardcoding and code repetition, but I think this makes it easier to understand the algorithm behind the code.

It’s definitely not finished, but your input and opinions are very welcome.

You can try it on mybinder and execute the code right in your browser (the service is sometimes down):

http://mybinder.org/repo/davidsanwald/ai-notebook

Static version:

http://nbviewer.jupyter.org/github/DavidSanwald/ai-notebook/blob/master/index.ipynb

Or download it from GitHub:

https://github.com/DavidSanwald/ai-notebook