Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is F# compiler sometimes incorrectly generalizing functions?

I ran into some unexpected behavior from the F# compiler recently. I was able to find a workaround, but the original behavior baffles me and I wanted to see if anyone can help me understand what causes it.

A function that I had defined as non-generic was becoming generic, which interfered with the ability of the function to share state between multiple invocations. I simplified my use-case down to the following:

let nextId =
  let mutable i = 0
  let help (key:obj) =
    i <- i + 1
    i
  help
nextId "a" // returns 1
nextId "b" // also returns 1!!!!

Why is nextId of type 'a -> int instead of obj -> int? Clearly the generalization is also responsible for the bug where it returns 1 repeatedly, but why is the generalization occurring in the first place?

Note that if I define it without naming the nested function, it works as expected in giving unique Ids:

let nextId =
  let mutable i = 0
  fun (key:obj) ->
    i <- i + 1
    i
nextId "a" // returns 1
nextId "b" // returns 2

But even more mysterious, with this definition, F# Interactive can't decide whether nextId is an (obj -> int) or an ('a -> int). When I first define it I get

val nextId : (obj -> int)

but if I simply eval

nextId

I get

val it : ('a -> int)

What's going on here and why does my simple function get automatically generalized?

like image 829
Maximilian Wilson Avatar asked Jul 03 '17 20:07

Maximilian Wilson


People also ask

Is f'n )= n 3 onto?

For each of the following functions, state whether or not they are one-to-one and whether or not they are onto: (a) Let f : Z → Z and f(n) = n3 The function f is one-to-one since n3 = m3 implies n = m. However, it is not onto since the integer 4 (among others) is not in the image of f.

What does f mean in physics?

F = force m = mass a = acceleration Newton's Second Law. Here, F is the net force on the mass m. W = mg. W = weight. m = mass.

What does f in math stand for?

more ... A special relationship where each input has a single output. It is often written as "f(x)" where x is the input value. Example: f(x) = x/2 ("f of x equals x divided by 2")

What does f R -> R mean?

For example, when we use the function notation f:R→R, we mean that f is a function from the real numbers to the real numbers. In other words, the domain of f is the set of real number R (and its set of possible outputs or codomain is also the set of real numbers R).


2 Answers

I agree that this is quite unexpected behaviour. I think the reason why F# performs the generalization is that it treats help (when returning it) as fun x -> help x. Calling a function that takes obj seems to be one case where the compiler performs generalization (because it knows that anything can be obj). The same generalization happens, for example, in:

let foo (o:obj) = 1
let g = fun z -> foo z

Here, g becomes 'a -> int too, just like in your first version. I don't quite know why the compiler does this, but what you see can be explained by 1) treating help as fun x -> help x and 2) generalising on calls taking obj.

The other thing that is happening is how F# treats generic values - generic values are generally problematic in ML languages (that's what the whole "value restriction" business is about), but F# allows it in some limited cases - you can for example write:

let empty = []

This defines a generic value of type 'a list. The caveat is that this gets compiled as a function that is called every time you access the empty value. I think your first nextId function gets compiled in the same way - so the body is evaluated each time you access it.

This probably does not answer the why but I hope it provides some more tips on how this is happening - and in what other cases the behaviour you're seeing might be sensible!

like image 50
Tomas Petricek Avatar answered Oct 21 '22 15:10

Tomas Petricek


I can't tell why the compiler decides to generalize in your first scenario, but ultimately the distinction between nextId being of type obj -> int vs 'a -> int is what drives the seemingly weird behavior here.

For what it's worth, you can "force" the expected behavior in your first scenario with yet another type annotation:

let nextId : obj -> int =
    let mutable i = 0
    let help (key:obj) =
        i <- i + 1
        i
    help

Now if you put those values in modules (like in this gist), compile and inspect the assembly in ILSpy, you'll find that the code is almost identical except for where the ref cell for the counter is instantiated:

  • In the concrete case, nextId is a property yielding a function which is instantiated together with the ref cell in the static initializer for the module, i.e. all calls to nextId share the same counter,

  • In the generic case, nextId is a generic function yielding a function, and the ref cell is instantiated within its body, i.e. you have a counter per a call to nextId.

So the code emitted in the generic case could actually be rendered in F# with this snippet:

let nextId () =
    let mutable i = 0
    fun key ->
        i <- i + 1
        i

The bottom line is that it would make sense to emit a compiler warning when you have a generic value like this. It's easy to avoid the problem once you know it's there, but it's one of those things you won't see coming otherwise.

like image 5
scrwtp Avatar answered Oct 21 '22 16:10

scrwtp