That is clearly impressive, as the two videos are only a few minutes apart.
Now, let’s discuss how to do this.
Here is a rundown of how the neural network and the game will talk to each other.
Every few frames, the game sends the neural network a message containing important information. The neural network learns from this information, uses it to make a decision, and sends that decision back to the game.
The game takes this decision and makes the player act accordingly.
Exactly what the neural network and the game will exchange in this communication is not the main concern right now.
We will come to that soon.
For now, think of the neural network and the game as separate entities doing their own thing and talking to each other when required.
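To make the back-and-forth a little more concrete, here is a minimal sketch of that loop in Python. Everything in it is an assumption made for illustration: the 3-frame interval, the `enemy_distance` field, the `StubNetwork` class and its fixed rule are placeholders, not what the real game and neural network exchange.

```python
FRAMES_PER_MESSAGE = 3  # assumption: "every few frames" taken as every 3 frames


class StubNetwork:
    """Hypothetical stand-in for the neural network."""

    def decide(self, message):
        # A real network would learn from the message before deciding.
        # This stub just jumps when the made-up enemy is close.
        return "jump" if message["enemy_distance"] < 3 else "stay"


def run_game(network, total_frames=20):
    """Send a message every few frames and collect the decisions."""
    decisions = []
    enemy_distance = 10  # hypothetical game state
    for frame in range(total_frames):
        # The enemy moves one step closer, then respawns far away.
        enemy_distance = enemy_distance - 1 if enemy_distance > 0 else 10
        if frame % FRAMES_PER_MESSAGE == 0:
            message = {"frame": frame, "enemy_distance": enemy_distance}
            decision = network.decide(message)  # network replies
            decisions.append((frame, decision))  # game would act on it here
    return decisions


decisions = run_game(StubNetwork())
```

The point is only the shape of the conversation: the game owns the frame loop, and the neural network is consulted at intervals and answers with a single decision.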
Training
For the neural network to get better, we need to make it learn.
To do that, we need information on whether the action performed by the neural network was correct or not.
And for this, we will use the score.
The score changes on two events: when the player successfully jumps over an enemy, and when the player doesn’t jump and stays still on the ground.
The reasoning for the first score increase is clear.
The second one however is a bit different.
You see, if the neural network is not rewarded for staying still, an ideal strategy for it would be to always jump.
Like, seriously always.
And, being the smart neural network it is, it learns to constantly jump.
Therefore, we need to reward it for staying on the ground when a jump isn’t needed.
This is why we give it a score of +1 when it doesn’t jump.
If it is not perfectly clear now, don’t worry. As we get into the code, it will become much clearer.
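As a rough illustration of the two scoring events, here is a small Python sketch. The function names, the `enemy_near` flag, and the +1 for a successful jump are assumptions for the example; only the +1 for not jumping comes from the description above.

```python
def score_change(action, enemy_near, reward_staying=True):
    """Return the score change for one decision (hypothetical rule)."""
    if action == "jump" and enemy_near:
        return 1  # assumption: a successful jump over an enemy is worth +1
    if action == "stay" and not enemy_near and reward_staying:
        return 1  # the +1 for not jumping when no jump is needed
    return 0


def always_jump(enemy_near):
    return "jump"


def sensible(enemy_near):
    return "jump" if enemy_near else "stay"


def total_score(policy, enemies, reward_staying=True):
    """Add up the score a policy earns over a sequence of moments."""
    return sum(score_change(policy(e), e, reward_staying) for e in enemies)


# Made-up sequence: True means an enemy is close at that moment.
enemies = [False, False, True, False, False, True]

with_reward = (total_score(always_jump, enemies), total_score(sensible, enemies))
without_reward = (
    total_score(always_jump, enemies, reward_staying=False),
    total_score(sensible, enemies, reward_staying=False),
)
```

In this toy setup, the two strategies tie when staying still earns nothing, so the score gives the network no reason to stop jumping all the time. Once staying still is rewarded, the sensible strategy comes out ahead.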
One of the key pieces of information that the game sends to the neural network is what the last action was and how it impacted the score.
If the score increased, the last action was successful and the neural network adds it to its training data. If not, the neural network ignores the last action.
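The "keep it only if the score went up" rule can be sketched in a few lines of Python. The function name, the shape of `game_state`, and the tuple format of the training data are all hypothetical here, chosen just to show the idea.

```python
def record_if_successful(training_data, game_state, last_action,
                         score_before, score_after):
    """Keep the last action as a training example only if the score rose."""
    if score_after > score_before:
        training_data.append((game_state, last_action))
        return True
    return False  # score did not increase, so the action is ignored


training_data = []

# Hypothetical messages: (game state, last action, score before, score after)
record_if_successful(training_data, {"enemy_distance": 2}, "jump", 4, 5)
record_if_successful(training_data, {"enemy_distance": 9}, "jump", 5, 5)
record_if_successful(training_data, {"enemy_distance": 8}, "stay", 5, 6)
```

After these three calls, only the first and third actions end up in `training_data`, because the second one left the score unchanged.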
These concepts will get much easier to understand when you actually look at the code and make changes yourself and see how the neural network behaves.
Finally, one last concept: for the first few runs, the neural network will make some random decisions in order to learn.
Imagine a toddler walking for the first time, bumping into a table, and starting to cry.
The toddler has learnt not to walk into the table.
This is what our neural network will do initially.
It will make random jumps, see which ones work and which ones don’t, and use this information to make better decisions in the next game.
Again, it gets better when you do it in practice.
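One simple way to picture the "random at first" behaviour is the Python sketch below. The number of random runs and the function name are assumptions for illustration; how many runs stay random in practice is not specified here.

```python
import random

RANDOM_RUNS = 3  # assumption: how many early runs use random decisions


def choose_action(run_number, network_decision, rng=random):
    """Pick a random action in early runs, then trust the network."""
    if run_number < RANDOM_RUNS:
        return rng.choice(["jump", "stay"])  # explore at random
    return network_decision  # use what the network has learnt


early = choose_action(0, "stay", random.Random(0))  # random pick
later = choose_action(5, "stay")                    # network's own decision
```

During the random runs the network's own suggestion is ignored entirely, which is what lets it stumble onto actions that raise the score.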
In the next post, we will actually delve into the code and see our neural network trying to learn what to do.