Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MailboxProcessor: Memory leak using return! before receive

Tags:

f#

Given the following agent, which is a simple cache mechanism:

type CacheMsg<'a,'b> = Add of 'a * 'b | ForceFlush

type CacheAgent<'a, 'b when 'a : comparison>(size:int, flushCont:Map<'a, 'b> -> unit) =
    let agent = MailboxProcessor.Start(fun inbox ->
        let rec loop (cache : Map<'a, 'b>) = async {
            let inline flush() =
                flushCont cache
                loop Map.empty

            if cache.Count > size then return! flush()

            let! msg = inbox.Receive()

            match msg with
            | Add (key, value) ->
                if cache.ContainsKey key then
                    return! loop cache
                else return! loop (cache.Add(key, value))
            | ForceFlush -> return! flush() }
        loop Map.empty)

    member x.AddIfNotExists key value = Add(key,value) |> agent.Post
    member x.ForceFlush() = agent.Post ForceFlush

This agent will keep taking up memory (seems like the memory is not freed when the flushCont has been called).

Given the same code, but with a minor change:

type CacheMsg<'a,'b> = Add of 'a * 'b | ForceFlush

type CacheAgent<'a, 'b when 'a : comparison>(size:int, flushCont:Map<'a, 'b> -> unit) =
    let agent = MailboxProcessor.Start(fun inbox ->
        let rec loop (cache : Map<'a, 'b>) = async {
            let inline flush() =
                flushCont cache
                loop Map.empty

            let! msg = inbox.Receive()

            match msg with
            | Add (key, value) ->
                if cache.ContainsKey key then
                    return! loop cache
                else 
                    let newCache = cache.Add(key, value)
                    if newCache.Count > size then 
                        return! flush()
                    else return! loop (cache.Add(key, value))
            | ForceFlush -> return! flush() }
        loop Map.empty)

    member x.AddIfNotExists key value = Add(key,value) |> agent.Post
    member x.ForceFlush() = agent.Post ForceFlush

I have moved the expression that decides when to flush, into the union case Add. This results in the memory is freed as expected.

What's wrong about the first approach, since it leaks memory?

like image 269
ebb Avatar asked May 20 '14 23:05

ebb


1 Answers

The first version isn't tail recursive.

It's not tail recursive, because this expression isn't the last expression in the function:

if cache.Count > size then return! flush()

After that expression, you call

let! msg = inbox.Receive()

so the flush() call isn't the last thing happening. After the recursive call implicit in flush has completed, the execution will need to return to the next expression, where you invoke inbox.Receive(). That means that the context will have to keep the previous invocation on the stack, because the recursion isn't in a tail position: there's still more work to do.

In the second example, all calls to flush and loop are in tail positions.

If you're coming from a C# background, you'd be inclined to think that return! flush() exits the function, but that's not really the case here. The only reason

if cache.Count > size then return! flush()

even compiles without a corresponding else branch is because the expression returns unit. This means that the code inside the then branch doesn't truly exit the function - it just performs the work in the branch (in this case flush()), and then continues executing the subsequent expressions.

like image 177
Mark Seemann Avatar answered Nov 11 '22 13:11

Mark Seemann