I am a little confused about how the MCTS "Tree Policy" is implemented. Every paper or article I read talks about going down the tree from the current game state (in MCTS terminology: the root for the player about to make a move). My question is: how am I selecting the best child even when I am at the MIN player's level (assuming I am the MAX player)? Even if I select some particular action that MIN might take, and my search tree gets deeper through that node, the MIN player during its turn might just as well choose a different node. (If the MIN player is an amateur human, it might well choose a node that is not the best.) This seems to make all of MAX's work in expanding through that node futile, since MIN has chosen a different node. For the steps I am referring to: https://jeffbradberry.com/posts/2015/09/intro-to-monte-carlo-tree-search/ where the tree policy illustration, https://jeffbradberry.com/images/mcts_selection.png, makes me believe that they are executing it from a single player's perspective.
To implement MCTS for a two-player game, you can simply flip the sign of the reward at every step of back-propagation, a one-line change in the code.
This means we are trying to maximize reward at every layer, but as the reward is propagated up the tree, a positive reward for your opponent becomes a negative reward for you when it reaches your layer.
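Here is a minimal sketch of that back-propagation step in Python. The `Node` class and its field names are my own assumptions for illustration, not from any particular library; the convention is that each node stores rewards from the perspective of the player who made the move into it, so the selection step can maximize at every layer.

```python
import math


class Node:
    """A hypothetical MCTS node; field names are assumptions for this sketch."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        # Total reward from the perspective of the player who just moved
        # into this node (i.e., the player choosing at the parent).
        self.total_reward = 0.0

    def ucb1(self, exploration=1.41):
        # Because each node stores rewards from its parent's perspective,
        # the tree policy can simply pick the child maximizing this score
        # at every layer, MAX and MIN levels alike.
        if self.visits == 0:
            return math.inf
        return (self.total_reward / self.visits
                + exploration * math.sqrt(math.log(self.parent.visits)
                                          / self.visits))


def backpropagate(node, reward):
    """Walk from the simulated leaf back to the root, flipping the sign
    of the reward at each level so a win for one player counts as a loss
    for the other. `reward` starts from the perspective of the player
    who made the move into `node`."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        reward = -reward  # the one-line sign flip for two-player games
        node = node.parent
```

With this convention, selection never needs a special MIN case: each layer just maximizes its own stored statistics, and the sign flip during back-propagation takes care of the alternating perspectives.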