Summary of my objective: Figure out how to use continuation-passing style to avoid a stack overflow when using an algorithm I believe cannot be made tail-recursive. Alternatively, find a way to make the function tail-recursive.
Details: I am new to F# (and functional programming in general) and I am attempting to implement the minimax algorithm with alpha-beta pruning. This is an algorithm used to determine the best possible move for a two-player game. The pseudocode for the algorithm can be found here: https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning
This is a resource I've found useful for understanding the flow of the algorithm: http://inst.eecs.berkeley.edu/~cs61b/fa14/ta-materials/apps/ab_tree_practice/
One difference in my implementation is that the players for the game I'm working on do not always alternate. For this reason, I've removed one of the parameters from the function. My implementation is below:
let rec minimax node depth alpha beta =
if depth = 0 || nodeIsTerminal node then
heuristicValue node
else
match node.PlayerToMakeNextChoice with
| PlayerOneMakesNextChoice ->
takeMax (getChildren node) depth alpha beta
| PlayerTwoMakesNextChoice ->
takeMin (getChildren node) depth alpha beta
and takeMax children depth alpha beta =
match children with
| [] -> alpha
| firstChild :: remainingChildren ->
let newAlpha = [alpha; minimax firstChild (depth - 1) alpha beta] |> List.max
if beta < newAlpha then newAlpha
else takeMax remainingChildren depth newAlpha beta
and takeMin children depth alpha beta =
match children with
| [] -> beta
| firstChild :: remainingChildren ->
let newBeta = [beta; minimax firstChild (depth - 1) alpha beta] |> List.min
if newBeta < alpha then newBeta
else takeMax remainingChildren depth alpha newBeta
The problem I'm having is that while takeMax
and takeMin
are tail-recursive, these methods call minimax
when assigning newAlpha
and newBeta
so it is still likely to produce a stack overflow when I call minimax
with a large depth. I have done some research and found that using continuation-passing style is a potential way to use the heap rather than the stack when the function cannot be made tail-recursive (and I believe after many hours of trying that this cannot). While I can understand some very basic examples, I'm having trouble understanding how I would apply the concept to this situation; I'd be very grateful if someone could help walk me through it.
Edit 1: My best understanding of the solution
let minimax node depth alpha beta =
let rec recurse node depth alpha beta k =
if depth = 0 || nodeIsTerminal node then
k (heuristicValue node)
else
match node.PlayerToMakeNextChoice with
| PlayerOneMakesNextChoice ->
takeMax (getChildren node) depth alpha beta k
| PlayerTwoMakesNextChoice ->
takeMin (getChildren node) depth alpha beta k
and takeMax children depth alpha beta k =
match children with
| [] -> k alpha
| firstChild :: remainingChildren ->
let continuation = fun minimaxResult ->
let newAlpha = [alpha; minimaxResult] |> List.max
if beta < newAlpha then k newAlpha
else takeMax remainingChildren depth newAlpha beta k
recurse firstChild (depth - 1) alpha beta continuation
and takeMin children depth alpha beta k =
match children with
| [] -> k beta
| firstChild :: remainingChildren ->
let continuation = fun minimaxResult ->
let newBeta = [beta; minimaxResult] |> List.min
if newBeta < alpha then k newBeta
else takeMax remainingChildren depth alpha newBeta k
recurse firstChild (depth - 1) alpha beta continuation
recurse node depth alpha beta id
As you have undoubtedly seen in the "basic examples", the general idea is to take one extra parameter (the "continuation", usually denoted k
), which is a function, and every time you would return a value, pass that value to the continuation instead. So, for example, to modify minimax
this way, we'd get:
let rec minimax node depth alpha beta k =
if depth = 0 || nodeIsTerminal node then
k (heuristicValue node)
else
match node.PlayerToMakeNextChoice with
| PlayerOneMakesNextChoice ->
k (takeMax (getChildren node) depth alpha beta)
| PlayerTwoMakesNextChoice ->
k (takeMin (getChildren node) depth alpha beta)
And then the call site needs to be "turned inside out", so to speak, so instead of something like this:
let a = minimax ...
let b = f a
let c = g b
c
we would write something like this:
minimax ... (fun a ->
let b = f a
let c = g b
c
)
See? a
used to be a return value of minimax
, but now a
is the parameter of the continuation that is passed to minimax
. The runtime mechanics is that, once minimax
has finished running, it would call this continuation, and its result value would show up there as parameter a
.
So, to apply this to your real code, we'd get this:
| firstChild :: remainingChildren ->
minimax firstChild (depth - 1) alpha beta (fun minimaxResult ->
let newAlpha = [alpha; minimaxResult] |> List.max
if beta < newAlpha then newAlpha
else takeMax remainingChildren depth newAlpha beta
)
Ok, this is all well and good, but this is only half the job: we have rewritten minimax
in CPS, but takeMin
and takeMax
are still recursive. Not good.
So let's do takeMax
first. Same idea: add an extra parameter k
and every time we would "return" a value, pass it to k
instead:
and takeMax children depth alpha beta k =
match children with
| [] -> k alpha
| firstChild :: remainingChildren ->
minimax firstChild (depth - 1) alpha beta (fun minimaxResult ->
let newAlpha = [alpha; minimaxResult] |> List.max
if beta < newAlpha then k newAlpha
else takeMax remainingChildren depth newAlpha beta k
)
And now, of course, I have to modify the call site correspondingly:
let minimax ... k =
...
match node.PlayerToMakeNextChoice with
| PlayerOneMakesNextChoice ->
takeMax (getChildren node) depth alpha beta k
Wait, what just happened? I just said that every time I return a value I should pass it to k
instead, but here I'm not doing it. Instead, I'm passing k
itself to takeMax
. Huh?
Well, the rule that "instead of returning pass to k
" is only the first part of the approach. The second part is - "on every recursive call, pass k
down the chain". This way, the original top-level k
would travel down the whole recursive chain of calls, and be ultimately called by whatever function decides to stop the recursion.
Keep in mind that, although CPS helps with stack overflow, it does not free you from memory limits in general. All those intermediate values don't go on the stack anymore, but they have to go somewhere. In this case, every time we construct that lambda fun minimaxResult -> ...
, that's a heap allocation. So all your intermediate values go on the heap.
There is a nice symmetry though: if the algorithm was truly tail-recursive, you'd be able to pass the original top-level continuation down the call chain without allocating any intermediate lambdas, and so you wouldn't require any heap memory.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With