Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using F#'s hash function inside GetHashCode() evil?

I encountered a couple of places online where code looked something like this:

[<CustomEquality;NoComparison>]
type Test =
    | Foo
    | Bar
    override x.Equals y = 
        match y with
        | :? Test as y' ->
            match y' with
            | Foo -> false
            | Bar -> true    // silly, I know, but not the question here
        | _ -> failwith "error"   // don't do this at home

    override x.GetHashCode() = hash x

But when I run the above in FSI, the prompt does not return when I either call hash foo on an instance of Test or when I call foo.GetHashCode() directly.

let foo = Test.Foo;;
hash foo;;   // no returning to the console until Ctrl-break
foo.GetHashCode();;  // no return

I couldn't readily proof it, but it suggests that hash x calls GetHashCode() on the object, which means the above code is dangerous. Or is it just FSI playing up?

I thought code like the above just means "please implement custom equality, but leave the hash function as default".

I have meanwhile implemented this pattern differently, but am still wondering whether I am correct in assuming that hash just calls GetHashCode(), leading to an eternal loop.


As an aside, using equality inside FSI returns immediately, suggesting that it either does not call GetHashCode() prior to comparison, or it does something else. Update: this makes sense as in the example above x.Equals does not call GetHashCode(), and the equality operator calls into Equals, not into GetHashCode().

like image 968
Abel Avatar asked Dec 24 '22 01:12

Abel


2 Answers

It's not quite as simple as the hash function simply being a wrapper for GetHashCode but I can comfortably tell you that it's definitely not safe to use the implementation : override x.GetHashCode() = hash x.

If you trace the hash function through, you end up here:

let rec GenericHashParamObj (iec : System.Collections.IEqualityComparer) (x: obj) : int =
    match x with 
    | null -> 0 
    | (:? System.Array as a) -> 
        match a with 
        | :? (obj[]) as oa -> GenericHashObjArray iec oa 
        | :? (byte[]) as ba -> GenericHashByteArray ba 
        | :? (int[]) as ba -> GenericHashInt32Array ba 
        | :? (int64[]) as ba -> GenericHashInt64Array ba 
        | _ -> GenericHashArbArray iec a 
    | :? IStructuralEquatable as a ->    
        a.GetHashCode(iec)
    | _ -> 
        x.GetHashCode()

You can see here that the wild-card case calls x.GetHashCode(), hence it's very possible to find yourself in an infinite recursion.

The only case I can see where you might want to use hash inside an implementation of GetHashCode() would be when you are manually hashing some of an object's members to produce a hash code.

There is a (very old) example of using hash inside GetHashCode() in this way in Don Syme's WebLog.


By the way, that's not the only thing unsafe about the code you posted.

Overrides for object.Equals absolutely must not throw exceptions. If the types do not match, they are to return false. This is clearly documented in System.Object.

Implementations of Equals must not throw exceptions; they should always return a value. For example, if obj is null, the Equals method should return false instead of throwing an ArgumentNullException.

(Source)

like image 184
TheInnerLight Avatar answered Dec 28 '22 07:12

TheInnerLight


If the GetHashCode() method is overridden, then the hash operator will use that:

[The hash operator is a] generic hash function, designed to return equal hash values for items that are equal according to the = operator. By default it will use structural hashing for F# union, record and tuple types, hashing the complete contents of the type. The exact behavior of the function can be adjusted on a type-by-type basis by implementing System.Object.GetHashCode for each type.

So yes, this is a bad idea and it makes sense that it would lead to an infinite loop.

like image 45
JLRishe Avatar answered Dec 28 '22 08:12

JLRishe