Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Monte Carlo Tree Search: Tree Policy for two player games

I am a little confused about how the MCTS "Tree Policy" is implemented. Every paper or article I read talks about going down the tree from the current game state(in MCTS teminology: the root for the player about to make a move). My question is how am I selecting the best child even when I am at the MIN player level ( assuming I am the MAX player). Even if I select some particular action that MIN might take, and my search tree gets deeper through that node, the MIN player during its turn might just as well choose some different node.( If the min player is a amateur human it might just as well choose some node which is not necessarily the best). This kind of makes MAX's entire work in propagating through that node futile since the MIN has chosen a different node. For the steps I am referring to : https://jeffbradberry.com/posts/2015/09/intro-to-monte-carlo-tree-search/ where the tree policy : https://jeffbradberry.com/images/mcts_selection.png kind of makes me believe that they are executing it from a single player perspective.

like image 977
CS101 Avatar asked Feb 17 '17 15:02

CS101


1 Answers

To implement MCTS for two player game, you can simply flip the sign in every step of back-propagation, a one-line change in the code.

This means we are trying to maximize reward in every layer, but when we propagate the reward up the tree the positive reward for your opponent become negative for you when you get to your layer.

like image 187
Cash Lo Avatar answered Oct 06 '22 19:10

Cash Lo