 

Monte Carlo Tree Search UCT implementation


Can you explain to me how to build the tree?

I mostly understand how the nodes are chosen, but a clearer explanation would really help me implement this algorithm. I already have a board representing the game state, but I don't know how to generate the tree.

Can someone point me to a well-commented implementation of the algorithm (I need to use it for an AI)? Or a better explanation or examples of it?

I haven't found many resources on the net; this algorithm is rather new...

asked Jan 29 '12 by Makers_F

People also ask

How does the Monte Carlo search tree work?

What is Monte Carlo Tree Search? MCTS is an algorithm that figures out the best move out of a set of moves by Selecting → Expanding → Simulating → Updating the nodes in a tree to find the final solution. This method is repeated until it reaches the solution and learns the policy of the game.

What is UCT in MCTS?

One particular selection mechanism that has proved to be reliable is based on the Upper Confidence Bounds for Trees, commonly referred to as UCT. The UCT attempts to balance exploration and exploitation by considering the values stored in the statistical tree of the MCTS.
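The UCT rule described above can be written as a small scoring function. This is only a sketch: the names `child_wins`, `child_visits`, and `parent_visits` are my own, and the exploration constant `c` defaults to the textbook value of √2.

```python
import math

def uct_score(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 applied to trees: average payoff plus an exploration bonus."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    exploitation = child_wins / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration
```

During selection the algorithm simply descends to the child with the highest score at each level; the exploration term shrinks as a child accumulates visits, steering the search toward under-explored moves.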

Is Monte Carlo tree search better than Minimax?

Studies show that MCTS does not detect shallow traps (where opponents can win within a few moves) as well as minimax search does. Thus, minimax search performs better than MCTS in games like Chess, which can end instantly (king is captured).

How is a leaf node determined in Monte Carlo tree search during the selection phase?

MCTS uses the Upper Confidence Bound (UCB) formula applied to trees as the strategy in the selection process to traverse the tree. It balances the exploration-exploitation trade-off. During tree traversal, the child that maximizes this value is selected at each level.


1 Answer

The best way to generate the tree is a series of random playouts. The trick is balancing exploration and exploitation (this is where the UCT comes in). There are some good code samples and plenty of research paper references here: https://web.archive.org/web/20160308043415/http://mcts.ai:80/index.html

When I implemented the algorithm, I used random playouts until I hit an end point or termination state. I had a static evaluation function that would calculate the payoff at this point, then the score from this point is propagated back up the tree. Each player or "team" assumes that the other team will play the best move for themselves, and the worst move possible for their opponent.

I would also recommend checking out the papers by Chaslot and his PhD thesis, as well as some of the research that references his work (basically all the MCTS work since then).


For example: Player 1's first move could simulate 10 moves into the future, alternating between player 1 moves and player 2 moves. Each time you must assume that the opposing player will try to minimize your score whilst maximizing their own score. There is an entire field based on this known as Game Theory. Once you simulate to the end of 10 games, you iterate from the start point again (because there is no point in only simulating one set of decisions). Each of these branches of the tree must be scored, where the score is propagated up the tree and represents the best possible payoff for the player doing the simulating, assuming that the other player is also choosing the best moves for themselves.

MCTS consists of four strategic steps, repeated as long as there is time left. The steps are as follows.

  1. In the selection step the tree is traversed from the root node until we reach a node E, where we select a position that is not added to the tree yet.

  2. Next, during the play-out step moves are played in self-play until the end of the game is reached. The result R of this “simulated” game is +1 in case of a win for Black (the first player in LOA), 0 in case of a draw, and −1 in case of a win for White.

  3. Subsequently, in the expansion step children of E are added to the tree.

  4. Finally, R is propagated back along the path from E to the root node in the backpropagation step. When time is up, the move played by the program is the child of the root with the highest value.

(This example is taken from this paper (PDF): www.ru.is/faculty/yngvi/pdf/WinandsBS08.pdf)
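The four steps above can be put together in a compact sketch. The game here is a toy Nim variant (a pile of objects, take 1 or 2 per turn, whoever takes the last object wins); the game, the node fields, and the constants are all my own choices for illustration, not from the paper.

```python
import math
import random

def legal_moves(pile):
    """Toy Nim: take 1 or 2 objects; whoever takes the last one wins."""
    return [m for m in (1, 2) if m <= pile]

class Node:
    """One game position plus the statistics MCTS keeps for it."""
    def __init__(self, pile, parent=None, move=None):
        self.pile = pile
        self.parent = parent
        self.move = move              # the move that led to this position
        self.children = []
        self.untried = legal_moves(pile)
        self.visits = 0
        self.wins = 0.0               # payoff total, from the view of the player who made `move`

def uct_child(node, c=math.sqrt(2)):
    """Pick the child maximizing mean payoff plus exploration bonus."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
                              + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(pile):
    """Random self-play; +1 if the player to move on `pile` wins."""
    if pile == 0:
        return -1                     # the previous player took the last object
    return -rollout(pile - random.choice(legal_moves(pile)))

def mcts(root_pile, iterations=500):
    root = Node(root_pile)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded, non-terminal nodes.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one position not yet in the tree.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.pile - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Play-out: random self-play from the new position gives result R.
        result = rollout(node.pile)
        # 4. Backpropagation: push R back to the root, flipping sign per level.
        while node is not None:
            node.visits += 1
            node.wins += -result
            result = -result
            node = node.parent
    # When time is up, play the most-visited child of the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

From a pile of 2, taking both objects wins immediately, so the search concentrates its visits on move 2. Note the sign flip during backpropagation: each node's statistics are kept from the perspective of the player who moved into it, which is what makes the UCT selection at every level pick the best move for the player whose turn it is.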

Here are some implementations:

A list of libraries and games using some MCTS implementations http://senseis.xmp.net/?MonteCarloTreeSearch

and a game independent open source UCT MCTS library called Fuego http://fuego.sourceforge.net/fuego-doc-1.1/smartgame-doc/group__sguctgroup.html

answered Oct 22 '22 by danielbeard