A Genetic Algorithm for Tic-Tac-Toe

Tags:

genetic-algorithm

So I was assigned the problem of writing a 5x5x5 tic-tac-toe player using a genetic algorithm. My approach was to start off with 3x3, get that working, and then extend to 5x5, and then to 5x5x5.

The way it works is this:

Simulate a whole bunch of games, and during each turn of each game, look up the current board in the corresponding table (an X table or an O table, each implemented as a C++ standard library map) for a response. If the board is not there, add it to the table. Otherwise, make a random response.
After I have complete tables, I initialize a bunch of players (each with a copy of the board table, initialized with random responses), and let them play against each other.
Using their wins/losses to evaluate fitness, I keep a certain percentage of the best players, and they move on. Rinse and repeat for some number of generations, and an optimal player should emerge.
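The selection step described above could be sketched roughly as follows. This is only an illustration, not the asker's code: the `Player` alias, the 9-character board strings ('X', 'O', '.'), and the names `open_cells`, `mutate`, `next_generation`, `keep_fraction` and `mutation_rate` are all assumptions, and refilling the population with mutated copies of survivors is one common choice among several.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <random>
#include <string>
#include <vector>

// Hypothetical encoding: a player is a table from board string to chosen cell.
using Player = std::map<std::string, int>;

// Indices of the empty cells ('.') on a board.
std::vector<int> open_cells(const std::string& board) {
    std::vector<int> cells;
    for (std::size_t i = 0; i < board.size(); ++i)
        if (board[i] == '.') cells.push_back(static_cast<int>(i));
    return cells;
}

// Copy a parent and re-randomize each response with probability mutation_rate.
Player mutate(const Player& parent, double mutation_rate, std::mt19937& rng) {
    Player child = parent;
    std::bernoulli_distribution flip(mutation_rate);
    for (auto& entry : child) {
        if (!flip(rng)) continue;
        const std::vector<int> cells = open_cells(entry.first);
        if (cells.empty()) continue;
        std::uniform_int_distribution<std::size_t> pick(0, cells.size() - 1);
        entry.second = cells[pick(rng)];
    }
    return child;
}

// Keep the fittest keep_fraction of players unchanged, then refill the
// population with mutated copies of randomly chosen survivors.
std::vector<Player> next_generation(const std::vector<Player>& population,
                                    const std::vector<int>& fitness,
                                    double keep_fraction, double mutation_rate,
                                    std::mt19937& rng) {
    assert(!population.empty() && population.size() == fitness.size());
    std::vector<std::size_t> order(population.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return fitness[a] > fitness[b]; });

    std::size_t keep = static_cast<std::size_t>(keep_fraction * population.size());
    keep = std::min(std::max<std::size_t>(keep, 1), population.size());

    std::vector<Player> next;
    next.reserve(population.size());
    for (std::size_t i = 0; i < keep; ++i) next.push_back(population[order[i]]);

    std::uniform_int_distribution<std::size_t> pick(0, keep - 1);
    while (next.size() < population.size())
        next.push_back(mutate(next[pick(rng)], mutation_rate, rng));
    return next;
}
```

The fitness vector would come from the round of games between players (for example, wins minus losses); that part is left out here.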

For 3x3, discounting boards that were reflections/rotations of other boards, and boards where the move is either 'take the win' or 'block the win', the total number of boards I would encounter was either 53 or 38, depending on whether you go first or second. Fantastic! An optimal player was generated in under an hour. Very cool!
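One way to discount rotations and reflections is to map every board to a canonical representative before using it as a table key. The sketch below assumes a 3x3 board stored row-major as a 9-character string of 'X', 'O' and '.', and picks the lexicographically smallest of the 8 symmetric variants; the function names are made up for illustration and this is not necessarily how the original code did it.

```cpp
#include <algorithm>
#include <string>

// Rotate a 3x3 board (row-major, 9 chars) 90 degrees clockwise.
std::string rotate(const std::string& b) {
    std::string r(9, '.');
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            r[col * 3 + (2 - row)] = b[row * 3 + col];
    return r;
}

// Mirror a 3x3 board left to right.
std::string mirror(const std::string& b) {
    std::string r(9, '.');
    for (int row = 0; row < 3; ++row)
        for (int col = 0; col < 3; ++col)
            r[row * 3 + (2 - col)] = b[row * 3 + col];
    return r;
}

// Smallest string among the 4 rotations of the board and of its mirror image.
std::string canonical(const std::string& board) {
    std::string best = board;
    std::string plain = board;
    std::string flipped = mirror(board);
    for (int i = 0; i < 4; ++i) {
        best = std::min({best, plain, flipped});
        plain = rotate(plain);
        flipped = rotate(flipped);
    }
    return best;
}
```

With this, the nine possible opening moves collapse to three keys (corner, edge, center). A move looked up for the canonical board still has to be mapped back through the inverse symmetry before it is played on the real board.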

Using the same strategy for 5x5, I knew the size of the table would increase, but did not realize it would increase so drastically. Even discounting rotations/reflections and mandatory moves, my table is ~3.6 million entries, with no end in sight.
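To get a feel for the growth, a loose upper bound is 3^cells, since each cell is empty, X, or O. This overcounts heavily (it includes unreachable positions and ignores symmetry), so it is only a rough sketch of the scale; the function name is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Loose upper bound on board states: 3 choices per cell, so 3^cells.
// Only valid for cells <= 40, where the result still fits in 64 bits.
std::uint64_t board_state_upper_bound(int cells) {
    assert(cells >= 0 && cells <= 40);
    std::uint64_t states = 1;
    for (int i = 0; i < cells; ++i) states *= 3;
    return states;
}
```

That gives 19,683 for 3x3 and 847,288,609,443 for 5x5. For 5x5x5 the bound is 3^125, roughly 4.4 x 10^59, which does not fit in a 64-bit integer at all.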

Okay, so that's clearly not going to work; I need a new plan. What if I don't enumerate all the boards, but just some of them? Well, it seems like this won't work either, because if each player has just a fraction of the boards they might see, then they are going to make a lot of random moves, clearly steering away from optimality.

What is a realistic way of going about this? Am I going to be stuck using board features? The goal is to hard-code as little game functionality as possible.

I've been doing research, but everything I read points to minimax with alpha-beta pruning as the only viable option. I can certainly do it that way, but the GA is really cool; my current method just goes a bit beyond what is feasible here.

EDIT: The problem has been pretty much solved.

Using a similarity function that combines the Hamming distance of open spaces, the possible win conditions, and a few other measures has brought the table down to a very manageable 2500 possibilities, which a std::map handles in a fraction of a second.
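The exact similarity function is not given, so the following is only a guess at what one ingredient might look like: a Hamming distance over the open/occupied pattern of two boards, plus a nearest-neighbor lookup in a std::map. The board encoding ('.' for open cells) and the names `open_space_hamming` and `nearest_board` are assumptions; the win-condition terms and "other measures" are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <string>

// Count the cells that are open ('.') on exactly one of the two boards.
int open_space_hamming(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    int distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if ((a[i] == '.') != (b[i] == '.')) ++distance;
    return distance;
}

// Return the stored board closest to the query. Ties go to the smaller key,
// because std::map iterates in key order. The table must not be empty.
std::string nearest_board(const std::map<std::string, int>& table,
                          const std::string& query) {
    assert(!table.empty());
    int best = std::numeric_limits<int>::max();
    std::string best_board;
    for (const auto& entry : table) {
        const int d = open_space_hamming(entry.first, query);
        if (d < best) {
            best = d;
            best_board = entry.first;
        }
    }
    return best_board;
}
```

Note that the response stored for the nearest board may point at a cell that is already taken on the actual board, so a real player would need a fallback for that case.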


asked Apr 11 '11 19:04

prelic

2 Answers

My knowledge of GA is pretty limited, but in modeling board configurations, aren't you asking the wrong question? Your task isn't to enumerate all the possible winning configurations -- what you're trying to do is to find a sequence of moves that leads to a winning configuration. Maybe the population you should be looking at isn't a set of boards, but a set of move sequences.

Edit: I wasn't thinking so much of starting from a particular board as starting from an empty board. It's obvious on a 3x3 board that move sequences starting with (1,1) work out best for X. The important thing isn't that the final board has an X in the middle, it's that the X was placed in the middle first. If there's one or more best first moves for X, maybe there's also a best second, third, or fourth move for X, too? After several rounds of fitness testing and recombining, will we find that X's second move is usually the same, or is one of a small set of values? And what about the third move?

This isn't minimax because you're not looking for the best moves one at a time based on the previous state of the board, you're looking for all the best moves at the same time, hoping to converge on a winning strategy.

I know this doesn't solve your problem, but if the idea is to evolve a winning strategy then it seems natural that you'd want to look at sequences of moves rather than board states.
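To make the move-sequence idea concrete, here is one possible encoding for 3x3, offered as a sketch under assumptions rather than a definitive design: each chromosome is a permutation of the cell indices 0..8 in order of preference, and a player always takes the first cell in its sequence that is still open. The names `Chromosome`, `winner` and `play` are made up for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical encoding: a permutation of 0..8, most preferred cell first.
using Chromosome = std::vector<int>;

// Return 'X' or 'O' if that side has three in a row, otherwise '.'.
char winner(const std::string& b) {
    static const int lines[8][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6},
                                    {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}};
    for (const auto& l : lines)
        if (b[l[0]] != '.' && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
            return b[l[0]];
    return '.';
}

// Play one game, X moving first. Each side takes the first still-open cell
// in its own sequence. Returns 'X', 'O', or '.' for a draw.
char play(const Chromosome& x, const Chromosome& o) {
    std::string board(9, '.');
    for (int turn = 0; turn < 9; ++turn) {
        const Chromosome& seq = (turn % 2 == 0) ? x : o;
        const char mark = (turn % 2 == 0) ? 'X' : 'O';
        for (int cell : seq) {
            if (board[cell] == '.') {
                board[cell] = mark;
                break;
            }
        }
        const char w = winner(board);
        if (w != '.') return w;
    }
    return '.';
}
```

Fitness could then be the number of games a chromosome wins against the rest of the population. One limitation of this encoding is that the sequence does not react to the opponent's moves beyond skipping occupied cells.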

answered Oct 04 '22 20:10

Caleb

This seems to be a very old conversation, but it caught my attention. In case it helps the public discussion, here is my input.

I think the aim in your assigned task needs to be defined more clearly:

Are you trying to find a set of winning boards? I don’t think so, because this is very straightforward for a 3x3 board, which can even be solved by hand, and it can be extrapolated to larger boards. GA could be utilized for larger boards, but it would only be a GA exercise.
Are you trying to utilize GA to train AI players for tic-tac-toe? I think this should be the case. In that case, your GA strings/chromosomes should not represent winning boards; rather, they should represent the players' ordered move sequences for winning games. As expected, this is a bit trickier to model, and it would be a real AI training programming exercise.

I hope this perspective helps.

answered Oct 04 '22 19:10

Pınar Tan
