I'm writing an AI for a card game, and after some testing I've discovered that using MTD(f) on top of my alpha-beta algorithm - a series of zero-window searches - is faster than using alpha-beta by itself.
The MTD(f) algorithm is described well here http://people.csail.mit.edu/plaat/mtdf.html
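For reference, here is a minimal Python sketch of the MTD(f) driver, following the pseudocode on that page; `alpha_beta_with_memory` and `first_guess` are placeholder names for your own search routine and initial estimate:

```python
def mtdf(root, depth, first_guess):
    """MTD(f): converge on the minimax value via zero-window searches."""
    g = first_guess
    lower, upper = float("-inf"), float("inf")
    while lower < upper:
        # Zero-window search: [beta - 1, beta] leaves no room between
        # alpha and beta, so every call fails either high or low.
        beta = max(g, lower + 1)
        g = alpha_beta_with_memory(root, beta - 1, beta, depth)
        if g < beta:
            upper = g  # failed low: g is an upper bound on the true value
        else:
            lower = g  # failed high: g is a lower bound on the true value
    return g
```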
The problem I have is that for each pass in the MTD(f) search (for each guess) I don't reuse any of the positions I have stored from previous passes, even though the write-up at the link suggests that I should (in fact, clearing the table between iterations speeds up the algorithm).
My problem is that when I store a position and a value in my transposition table, I also store the alpha and beta values for which it is valid. Therefore a second pass through the tree with a different guess (and therefore different alpha and beta) can't possibly reuse any information. Is this to be expected, or am I missing something fundamental here?
For instance, if for alpha=3, beta=4 we come to a result of 7 (obviously a cut-off), should I store that in the table as valid for alpha=3 to beta=6? Or beta=7?
A transposition table is a cache of previously seen positions, and their associated evaluations, in the game tree generated by a game-playing program. If a position recurs via a different sequence of moves, its value is retrieved from the table, avoiding a re-search of the game tree below that position.
For this reason, chess programs have a transposition table, which is a large hash table storing information about positions previously searched, how deeply they were searched, and what we concluded about them.
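Concretely, an entry records how deep the subtree below the position was searched and what the search proved about its value. A minimal sketch of such an entry (the field names are illustrative; the lower/upper bound pair is the representation used in the MTD(f) write-up):

```python
from dataclasses import dataclass

@dataclass
class TTEntry:
    depth: int         # how deep the subtree below this position was searched
    lowerbound: float  # proven lower bound on the true minimax value
    upperbound: float  # proven upper bound on the true minimax value

# Keyed by a hash of the position (chess engines typically use Zobrist hashing).
transposition_table: dict = {}
```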
Your problem comes down to a conceptual understanding of how to use a transposition table alongside an alpha-beta search. This was a huge problem I ran into as well, and after looking around I found this discussion, which explained the concept to me more naturally than any paper I had read on the topic.
Basically, you cannot treat all alpha-beta results the same: when a cutoff occurs, the result represents only a bound on the true minimax value, not the value itself. It has been proven that searching with bounds will still always give you the same best next state, though possibly without the exact score. When you store a state that resulted from a cutoff, you need to treat its value as a bound and try to improve upon it on the next pass. This will often evaluate the same node multiple times, but each pass tightens the bounds toward the actual score as needed.
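To make that concrete, here is a rough sketch of alpha-beta with memory in the style of the pseudocode from the MTD(f) page. The key point for your question: don't store the alpha and beta you searched with; store what the result proves about the true value. A fail-low proves an upper bound, a fail-high proves a lower bound, and only a value strictly inside the window is exact. So in your example, getting 7 on the window alpha=3, beta=4 is stored as lowerbound=7 (the true value is at least 7), and that stays usable for any later window. The helpers `hash_of`, `children`, `evaluate`, and `is_max_node` are hypothetical, and `TTEntry`/`transposition_table` come from the sketch above:

```python
INF = float("inf")

def alpha_beta_with_memory(node, alpha, beta, depth):
    # Probe the table: stored bounds can narrow or even close the window.
    entry = transposition_table.get(hash_of(node))
    if entry is not None and entry.depth >= depth:
        if entry.lowerbound >= beta:
            return entry.lowerbound        # known to fail high here
        if entry.upperbound <= alpha:
            return entry.upperbound        # known to fail low here
        alpha = max(alpha, entry.lowerbound)
        beta = min(beta, entry.upperbound)

    if depth == 0:
        g = evaluate(node)                 # static evaluation at the leaf
    elif is_max_node(node):
        g, a = -INF, alpha
        for child in children(node):
            g = max(g, alpha_beta_with_memory(child, a, beta, depth - 1))
            a = max(a, g)
            if g >= beta:                  # beta cutoff
                break
    else:
        g, b = INF, beta
        for child in children(node):
            g = min(g, alpha_beta_with_memory(child, alpha, b, depth - 1))
            b = min(b, g)
            if g <= alpha:                 # alpha cutoff
                break

    # Store what this result PROVES about the true value, not the window:
    lo, hi = -INF, INF
    if g <= alpha:                         # fail-low: true value <= g
        hi = g
    if alpha < g < beta:                   # exact value inside the window
        lo = hi = g
    if g >= beta:                          # fail-high: true value >= g
        lo = g
    transposition_table[hash_of(node)] = TTEntry(depth, lo, hi)
    return g
```

Note the depth check on the probe: an entry is only trusted if its subtree was searched at least as deeply as the current request, which is why the table stores the search depth alongside the bounds.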
Here is a good example of a more complete implementation of the concepts listed in the previously linked article. Scroll to page 14.