Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Go's maps under the hood

After reading Dave Cheney's blogpost about Go's maps there is still few things unclear to me.

TLDR:

  • Why are they unordered?
  • Where actual values are stored in memory?

After digging in runtime package I found out that underlying map structure is following:

// A header for a Go map.
type hmap struct {
    // Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
    // Make sure this stays in sync with the compiler's definition.
    count     int // # live cells == size of map.  Must be first (used by len() builtin)
    flags     uint8
    B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
    noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
    hash0     uint32 // hash seed

    buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
    oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
    nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

    extra *mapextra // optional fields
}

buckets - is array of buckets where indexes is low-order bits of key's hash, where the bucket is:

// A bucket for a Go map.
type bmap struct {
    // tophash generally contains the top byte of the hash value
    // for each key in this bucket. If tophash[0] < minTopHash,
    // tophash[0] is a bucket evacuation state instead.
    tophash [bucketCnt]uint8
    // Followed by bucketCnt keys and then bucketCnt elems.
    // NOTE: packing all the keys together and then all the elems together makes the
    // code a bit more complicated than alternating key/elem/key/elem/... but it allows
    // us to eliminate padding which would be needed for, e.g., map[int64]int8.
    // Followed by an overflow pointer.
}

..well it's just array of uint8 where every item is first byte of key's hash. And key-value pairs are stores as key/key value/value (eight pairs per bucket). But where exactly? Considering that map may contain value of (almost) any type. There should be some kind of pointer to place in memory where array of values stored, but bmap doesn't have such info.

And since key's hashes are located in ordered array inside bucket, why it's order different every time I looping over map?

like image 288
zzell Avatar asked Dec 11 '19 16:12

zzell


People also ask

How are maps implemented in go?

Rather than having, as C++ has, a complete map implementation for each unique map declaration, the Go compiler creates a maptype during compilation and uses that value when calling into the runtime's map functions. Each maptype contains details about properties of this kind of map from key to elem.

Are maps ordered in Golang?

As a Golang map is an unordered collection, it does not preserve the order of keys. We can use additional data structures to iterate over these maps in sorted order.

What are maps in Golang?

Golang Maps is a collection of unordered pairs of key-value. It is widely used because it provides fast lookups and values that can retrieve, update or delete with the help of keys. It is a reference to a hash table.

Is there a map function in Golang?

Map() Function in Golang With Examples. strings. Map() Function in Golang is used to return a copy of the string given string with all its characters modified according to the mapping function.


1 Answers

  • Why are they unordered?

Because this gives greater freedom to the runtime to implement the map type. Although we know Go's (current) implementation is a hashmap, the language specification allows to use any map implementation like hash map, tree map etc. Also not having to remember the order, this allows the runtime to do its job more effectively and using less memory.

Adrian's comment nicely summarizes that order is rarely needed, and it would be a waste to always maintain order. When you do need order, you may use a data structure that provides the ordering. For examples, see Map in order range loop.

And since key's hashes are located in ordered array inside bucket, why it's order different every time I looping over map?

The Go authors intentionally made map's iteration order randomized (so we mortals don't get dependent on a fixed order). For more, see In Golang, why are iterations over maps random?

Also see related: Why can't Go iterate maps in insertion order?

  • Where actual values are stored in memory?

The "where" is specified by hmap.buckets. This is a pointer value, it points to an array in memory, an array holding the buckets.

buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.

So hmap.buckets points to a contiguous memory segment holding buckets. A bucket is "modeled" by bmap, but this is not its actual memory layout. A bucket starts with an array holding top hash bytes of keys being in the bucket (tophash [bucketCnt]uint8), and this array is followed by bucketCnt keys of the bucket, which is then followed by bucketCnt values of the bucket. Lastly there is an overflow pointer.

Think of the bucket like this conceptual type, which "visualizes" where keys and values are located in memory:

type conceptualBucket struct {
    tophash     [bucketCnt]uint8
    keys        [bucketCnt]keyType
    values      [bucketCnt]valueType
    overflowPtr uintptr
}

Note: bucketCnt is a compile time constant being 8, it is the maximum number of key/elem pairs a bucket can hold.

Of course this "picture" is inaccurate, but it gives the idea where / how keys and values are stored.

like image 69
icza Avatar answered Oct 22 '22 11:10

icza