Neural network Q-learning for tic-tac-toe - how to use the threshold

https://datascience.stackexchange.com/questions/26600

31-10-2019
Question

I am currently programming a Q-learning neural network that does not work. I have previously asked a question about inputs and have sorted that out. My current idea as to why the program does not work has to do with the threshold value, a variable specific to Q-learning with neural networks.

Basically, the threshold is a value between 0 and 1. You then generate a random number between 0 and 1; if this random number is larger than the threshold, you pick a completely random move, otherwise the neural network chooses the move with the largest Q value.
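To make the rule concrete, here is a minimal sketch of the selection step as I understand it. The names `choose_action`, `q_values` and `legal_moves` are made up for illustration, and restricting the choice to legal moves is my own assumption rather than something my program necessarily does. Note that with this convention the threshold plays the role of 1 - epsilon in the usual epsilon-greedy naming.

```python
import random

def choose_action(q_values, legal_moves, threshold, rng=random):
    """Pick a move using the threshold rule described above.

    q_values:    sequence of Q values, one per board cell (hypothetical layout)
    legal_moves: indices of the cells that are still empty (assumption)
    threshold:   value in [0, 1]; higher means more greedy choices
    """
    # Random number above the threshold: explore with a uniformly random move.
    if rng.random() > threshold:
        return rng.choice(legal_moves)
    # Otherwise exploit: take the legal move with the largest Q value.
    return max(legal_moves, key=lambda move: q_values[move])


# Example: nine Q values for a 3x3 board, three cells still empty.
q = [0.1, 0.9, 0.3, 0.0, 0.5, 0.2, 0.7, 0.4, 0.6]
print(choose_action(q, legal_moves=[0, 2, 6], threshold=1.0))
```

With `threshold=1.0` the random branch is never taken, so the choice is always greedy; with a threshold near 0 almost every move is random.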

My question is about how this threshold value should change over training. I am currently implementing it as starting at almost 0, then increasing linearly until it reaches 1 by the time the program has reached the final iteration. Is this correct?
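For reference, the schedule I am describing looks roughly like the sketch below. The function name `linear_threshold` and the starting value of 0.01 are placeholders chosen for this example, not the exact values from my program.

```python
def linear_threshold(iteration, total_iterations, start=0.01, end=1.0):
    """Threshold for a given training iteration, rising linearly from start to end.

    iteration is counted from 0; the value is clamped at `end` if training
    continues past the planned number of iterations.
    """
    if total_iterations <= 1:
        return end
    # Fraction of training completed, capped at 1.0.
    fraction = min(iteration / (total_iterations - 1), 1.0)
    return start + fraction * (end - start)


# Print the threshold at a few points of a 1001-iteration run.
for step in (0, 250, 500, 750, 1000):
    print(step, linear_threshold(step, 1001))
```

So early in training almost every move is random, and by the final iteration every move is the network's greedy choice.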

The reason I suspect this is incorrect is that, when plotting an error graph from training the neural network, the program does not learn at all until the threshold reaches almost 1, at which point it starts to learn very fast. If you run more iterations after it reaches 1, all the game sets in the replay memory become the same and the error is basically 0 from there on in.

Any feedback is greatly appreciated, and if this question is unclear in any way just let me know and I will try to fix it. Thank you to anyone who helps out.

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange