I am trying to implement the famous game of tic-tac-toe using machine learning with the least mean squares (LMS) rule (an exercise proposed in Tom Mitchell's well-known book, Machine Learning).
I made the computer learn by playing against an optimal opponent that picks the best moves, and then against a randomized player. Against the optimal opponent, my program won about 90% of the games and tied the rest without ever losing. Against a random opponent, it won about 83% and lost 15% of the games.
However, when I played against the program, I won every time using the same strategy.
Here's how my program works:
* create the learner and the opponent (randomized or optimal)
* while (game running)
  * generate all possible states for a turn and use the best one to make the turn
  * the best turn is saved
* go through the saved boards and calculate the value of every feature
  * calculate the board score using the features and current weights
  * calculate the training score:
    * if last board and won: training value of last board = 100
    * if last board and lost: training value of last board = -100
  * adjust the weights using the LMS rule
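The scoring and weight-adjustment steps above can be sketched roughly as follows, assuming a linear evaluation function with a bias weight. The function names, the learning rate `eta`, and the draw value of 0 are assumptions for illustration and are not stated above. Using the successor board's current estimate as the training value for non-final boards follows Mitchell's textbook formulation, which may differ from what this program actually does.

```python
from typing import List, Sequence


def board_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation: weights[0] is a bias, the rest pair with the features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights: Sequence[float], features: Sequence[float],
               v_train: float, eta: float = 0.01) -> List[float]:
    """One LMS step: w_i <- w_i + eta * (v_train - score) * x_i."""
    error = v_train - board_score(weights, features)
    updated = [weights[0] + eta * error]  # the bias sees a constant input of 1
    updated += [w + eta * error * x for w, x in zip(weights[1:], features)]
    return updated


def training_values(weights: Sequence[float],
                    feature_history: Sequence[Sequence[float]],
                    outcome: str) -> List[float]:
    """Training value per saved board.

    Final board: 100 (win), -100 (loss), 0 (draw, an assumption here).
    Earlier boards: the current estimate of the next saved board.
    """
    final = {"win": 100.0, "loss": -100.0, "draw": 0.0}[outcome]
    values = [board_score(weights, feature_history[i + 1])
              for i in range(len(feature_history) - 1)]
    values.append(final)
    return values
```

With large feature values or a large `eta`, updates like this can overshoot and diverge, so a small learning rate (or scaled features) is usually a safer starting point.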
I expected this approach to make the computer play perfectly (win most of the time, tie otherwise). Am I wrong, or is there something wrong with my training method?
Thoughts, ideas, code, and suggestions on which board features to use would be really appreciated.
I did a similar project to this in my senior year at Lehigh, in 1968-1969, when computers ran on water. ;-) One module was an optimal player, and the second module was the learning machine. In the best situation, the learning machine achieved perfect play in a very small number of training games. To make things more interesting, I also inserted a control to have the "optimal" player make random errors, at a rate I could control. I could then measure the rate of learning vs. the "intelligence" of the training partner (no longer optimal). Of significance, the learning machine still managed to become an optimal player, albeit after a somewhat longer time. Although this is a trivial game, when one extends the concept along philosophical lines, it does suggest that computers will eventually become much more highly intelligent than their "teachers."
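As an illustration of the controllable-error training partner described above, here is a minimal sketch of a minimax player that picks a random legal move with probability `error_rate` and the minimax-best move otherwise. This is not the original 1968 implementation; the board representation (a list of nine `"X"`, `"O"`, or `" "` cells) and all names are hypothetical.

```python
import random
from typing import List, Optional

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board: List[str]) -> Optional[str]:
    """Return "X" or "O" if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board: List[str], to_move: str, me: str) -> int:
    """Game value from `me`'s point of view: 1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == me else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0
    other = "O" if to_move == "X" else "X"
    scores = []
    for m in moves:
        board[m] = to_move                  # try the move
        scores.append(minimax(board, other, me))
        board[m] = " "                      # undo it
    return max(scores) if to_move == me else min(scores)


def noisy_optimal_move(board: List[str], me: str,
                       error_rate: float, rng: random.Random) -> int:
    """Minimax-best move, except a random legal move with probability error_rate."""
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if rng.random() < error_rate:
        return rng.choice(moves)            # deliberate "mistake"
    other = "O" if me == "X" else "X"
    best, best_score = moves[0], -2
    for m in moves:
        board[m] = me
        score = minimax(board, other, me)
        board[m] = " "
        if score > best_score:
            best, best_score = m, score
    return best
```

Setting `error_rate` to 0 gives a player that always picks a minimax-best move, and 1 gives a fully random one, so the same function can cover both kinds of opponent mentioned in the question.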