I ran into some unexpected behavior from the F# compiler recently. I was able to find a workaround, but the original behavior baffles me and I wanted to see if anyone can help me understand what causes it.
A function that I had defined as non-generic was becoming generic, which interfered with the ability of the function to share state between multiple invocations. I simplified my use-case down to the following:
let nextId =
let mutable i = 0
let help (key:obj) =
i <- i + 1
i
help
nextId "a" // returns 1
nextId "b" // also returns 1!!!!
Why is nextId of type 'a -> int instead of obj -> int? Clearly the generalization is also responsible for the bug where it returns 1 repeatedly, but why is the generalization occurring in the first place?
Note that if I define it without naming the nested function, it works as expected in giving unique Ids:
let nextId =
let mutable i = 0
fun (key:obj) ->
i <- i + 1
i
nextId "a" // returns 1
nextId "b" // returns 2
But even more mysterious, with this definition, F# Interactive can't decide whether nextId is an (obj -> int) or an ('a -> int). When I first define it I get
val nextId : (obj -> int)
but if I simply eval
nextId
I get
val it : ('a -> int)
What's going on here and why does my simple function get automatically generalized?
For each of the following functions, state whether or not they are one-to-one and whether or not they are onto: (a) Let f : Z → Z and f(n) = n3 The function f is one-to-one since n3 = m3 implies n = m. However, it is not onto since the integer 4 (among others) is not in the image of f.
F = force m = mass a = acceleration Newton's Second Law. Here, F is the net force on the mass m. W = mg. W = weight. m = mass.
more ... A special relationship where each input has a single output. It is often written as "f(x)" where x is the input value. Example: f(x) = x/2 ("f of x equals x divided by 2")
For example, when we use the function notation f:R→R, we mean that f is a function from the real numbers to the real numbers. In other words, the domain of f is the set of real number R (and its set of possible outputs or codomain is also the set of real numbers R).
I agree that this is quite unexpected behaviour. I think the reason why F# performs the generalization is that it treats help
(when returning it) as fun x -> help x
. Calling a function that takes obj
seems to be one case where the compiler performs generalization (because it knows that anything can be obj
). The same generalization happens, for example, in:
let foo (o:obj) = 1
let g = fun z -> foo z
Here, g
becomes 'a -> int
too, just like in your first version. I don't quite know why the compiler does this, but what you see can be explained by 1) treating help
as fun x -> help x
and 2) generalising on calls taking obj
.
The other thing that is happening is how F# treats generic values - generic values are generally problematic in ML languages (that's what the whole "value restriction" business is about), but F# allows it in some limited cases - you can for example write:
let empty = []
This defines a generic value of type 'a list
. The caveat is that this gets compiled as a function that is called every time you access the empty
value. I think your first nextId
function gets compiled in the same way - so the body is evaluated each time you access it.
This probably does not answer the why but I hope it provides some more tips on how this is happening - and in what other cases the behaviour you're seeing might be sensible!
I can't tell why the compiler decides to generalize in your first scenario, but ultimately the distinction between nextId
being of type obj -> int
vs 'a -> int
is what drives the seemingly weird behavior here.
For what it's worth, you can "force" the expected behavior in your first scenario with yet another type annotation:
let nextId : obj -> int =
let mutable i = 0
let help (key:obj) =
i <- i + 1
i
help
Now if you put those values in modules (like in this gist), compile and inspect the assembly in ILSpy, you'll find that the code is almost identical except for where the ref cell for the counter is instantiated:
In the concrete case, nextId
is a property yielding a function which is instantiated together with the ref cell in the static initializer for the module, i.e. all calls to nextId
share the same counter,
In the generic case, nextId
is a generic function yielding a function, and the ref cell is instantiated within its body, i.e. you have a counter per a call to nextId
.
So the code emitted in the generic case could actually be rendered in F# with this snippet:
let nextId () =
let mutable i = 0
fun key ->
i <- i + 1
i
The bottom line is that it would make sense to emit a compiler warning when you have a generic value like this. It's easy to avoid the problem once you know it's there, but it's one of those things you won't see coming otherwise.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With