Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unicity of complex key dictionaries in Go but not in Julia?

In GO when I use a struct as a key for a map, there is an unicity of the keys.

For example, the following code produce a map with only one key : map[{x 1}:1]

package main

import (
    "fmt"
)

type MyT struct {
    A string
    B int
}

func main() {

    dic := make(map[MyT]int)

    for i := 1; i <= 10; i++ {
        dic[MyT{"x", 1}] = 1
    }

    fmt.Println(dic)
}

// result : map[{x 1}:1]

I Tried to do the same in Julia and I had a strange surprise :

This Julia code, similar to the GO one, produces a dictionary whith 10 keys !

    type MyT
        A::String
        B::Int64
    end

    dic = Dict{MyT, Int64}()

    for i in 1:10
        dic[MyT("x", 1)] = 1
    end

    println(dic)
    # Dict(MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1)

    println(keys(dic))
    # MyT[MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1)]

So what I did wrong ?

Thank you @DanGetz for the solution ! :

immutable MyT     # or struct MyT with julia > 0.6
    A::String
    B::Int64
end

dic = Dict{MyT, Int64}()

for i in 1:10
    dic[MyT("x", 1)] = 1
end

println(dic)         # Dict(MyT("x", 1)=>1)
println(keys(dic))   # MyT[MyT("x", 1)]
like image 394
Fred Avatar asked Jun 01 '17 13:06

Fred


People also ask

How do you check if a key is in a dictionary Julia?

To test for the presence of a key in a dictionary, use haskey or k in keys(dict) . Base. eltype — Function.

Are Dictionaries mutable in Julia?

A Dictionary in Julia is a collection of key-value pairs, which provide much more flexibility than arrays or named tuples. In particular, Dictionaries are mutable, and the keys can be of any type (where in Arrays they have to be Integers, and in Named Tuples, symbols).

Are Dictionaries slow in Julia?

You could write a caching wrapper. Compared to python, dictionary lookup is very fast. This is ridiculously slow, because it involves a non-inlined runtime call to jl_object_id . At some point in the future it will probably get faster (someone writes an if @generated fallback that treats objects as blobs of memory).

How do you use Dict in Julia?

A dictionary in Julia can be created with a pre-defined keyword Dict(). This keyword accepts key-value pairs as arguments and generates a dictionary by defining its data type based on the data type of the key-value pairs. One can also pre-define the data type of the dictionary if the data type of the values is known.


1 Answers

Mutable values hash by identity in Julia, since without additional knowledge about what a type represents, one cannot know if two values with the same structure mean the same thing or not. Hashing mutable objects by value can be especially problematic if you mutate a value after using it as a dictionary key – this is not a problem when hashing by identity since the identity of a mutable object remains the same even when it is modified. On the other hand, it's perfectly safe to hash immutable objects by value – since they cannot be mutated, and accordingly that is the default behavior for immutable types. In the given example, if you make MyT immutable you will automatically get the behavior you're expecting:

immutable MyT # `struct MyT` in 0.6
    A::String
    B::Int64
end

dic = Dict{MyT, Int64}()

for i in 1:10
    dic[MyT("x", 1)] = 1
end

julia> dic
Dict{MyT,Int64} with 1 entry:
  MyT("x", 1) => 1

julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
  MyT("x", 1)

For a type holding a String and an Int value that you want to use as a hash key, immutability is probably the right choice. In fact, immutability is the right choice more often than not, which is why the keywords introducing structural types has been change in 0.6 to struct for immutable structures and mutable struct for mutable structures – on the principle that people will reach for the shorter, simpler name first, so that should be the better default choice – i.e. immutability.

As @ntdef has written, you can change the hashing behavior of your type by overloading the Base.hash function. However, his definition is incorrect in a few respects (which is probably our fault for failing to document this more prominently and thoroughly):

  1. The method signature of Base.hash that you want to overload is Base.hash(::T, ::UInt).
  2. The Base.hash(::T, ::UInt) method must return a UInt value.
  3. If you are overloading Base.hash, you should also overload Base.== to match.

So this would be a correct way to make your mutable type hash by value (new Julia session required to redefine MyT):

type MyT # `mutable struct MyT` in 0.6
    A::String
    B::Int64
end

import Base: ==, hash

==(x::MyT, y::MyT) = x.A == y.A && x.B == y.B

hash(x::MyT, h::UInt) = hash((MyT, x.A, x.B), h)

dic = Dict{MyT, Int64}()

for i in 1:10
    dic[MyT("x", 1)] = 1
end

julia> dic
Dict{MyT,Int64} with 1 entry:
  MyT("x", 1) => 1

julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
  MyT("x", 1)

This is kind of annoying to do manually, but the AutoHashEquals package automates this, taking the tedium out of it. All you need to do is prefix the type definition with the @auto_hash_equals macro:

using AutoHashEquals

@auto_hash_equals type MyT # `@auto_hash_equals mutable struct MyT` in 0.6
    A::String
    B::Int64
end

Bottom line:

  • If you have a type that should have value-based equality and hashing, seriously consider making it immutable.

  • If your type really has to be mutable, then think hard about whether it's a good idea to use as a hash key.

  • If you really need to use a mutable type as a hash key with value-based equality and hashing semantics, use the AutoHashEquals package.

like image 106
StefanKarpinski Avatar answered Nov 02 '22 23:11

StefanKarpinski