With a coworker we were discussing the performance of using a map as a list and comparing the use of interface as a value(map[int]interface{}
) vs the empty struct(map[int]struct{})
and we got some weird values on the benchmark tests.
Code
package main
func main() {}
func MapWithInterface() {
m := map[int]interface{}{}
for i := 1; i <= 100; i++ {
m[i] = nil
}
}
func MapWithEmptyStruct() {
m := map[int]struct{}{}
for i := 1; i <= 100; i++ {
m[i] = struct{}{}
}
}
Tests
package main
import "testing"
func Benchmark_Interface(b *testing.B) {
for i := 0; i < b.N; i++ {
MapWithInterface()
}
}
func Benchmark_EmptyStruct(b *testing.B) {
for i := 0; i < b.N; i++ {
MapWithEmptyStruct()
}
}
Results
go version go1.15.5 darwin/amd64
go test -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: awesomeProject1
Benchmark_Interface-8 130419 8949 ns/op 7824 B/op 7 allocs/op
Benchmark_EmptyStruct-8 165147 6964 ns/op 3070 B/op 17 allocs/op
PASS
ok awesomeProject1 3.122s
Question
So, it appears that the empty struct version run faster and use less memory, but does more allocations and can't figure it out why. Does anyone know why this happens?
A map of type map[string]int with an initial capacity of 100 now requires 4176 bytes (on 64-bit arch). The default initial capacity is around 7 if not specified explicitly.
To delete a key from a map, we can use Go's built-in delete function. It should be noted that when we delete a key from a map, its value will also be deleted as the key-value pair is like a single entity when it comes to maps in Go.
I created a repository that contains some tests to help in the understanding of this answer.
The internal design of maps in Golang is highly optimized for performance and memory management. Maps keep track of keys and values that can hold pointers. If the entries in a bucket can't hold pointers, maps just create overflow buckets to avoid unnecessary overhead with GC, which results in more allocations (the case of map[int]struct{}
).
We need to understand map initialization and map structure. Then, we will analyze the benchmarks.
There are two methods for map initialization:
make(map[int]string)
when we have no idea about how many entries will be added.make(map[int]string, hint)
when we have an idea about how many entries will be added. hint
is an estimate of the initial capacity.Maps are mutable and they will grow on-demand, no matter which initialization method we choose. However, the second method pre-allocates memory for at least hint
entries, which results in increased performance.
A map in Go is a hash table that stores its key/value pairs into buckets. Each bucket is an array that holds up to 8 entries. The default number of buckets is 1. Once the number of entries across each bucket reaches an average load of buckets (aka load factor), the map gets bigger by doubling the number of buckets. Every time a map grows, it allocates memory for the new-coming entries. In practice, every time the load of the buckets reaches 6.5 or more, the map grows.
Behind the scenes, a map is a pointer to the hmap
struct. There is also the maptype
struct, which holds some information about the type of the map
. The source code for map can be found here:
https://github.com/golang/go/blob/master/src/runtime/map.go
Below you can find some insights on how to hack the map
type and how to see a map growing:
One important thing to note is that maps keep track of keys and values that can hold pointers. If the entries in a bucket can't hold any pointers, the bucket is marked as containing no pointers and maps just create overflow buckets (which means more memory allocations). This avoids unnecessary overhead with GC. See this comment in the mapextra
struct (line 132) and this post for reference.
An empty struct struct{}
has no fields and cannot hold any pointers. As a result, the buckets in the empty struct case will be marked as containing no pointers and we can expect more memory allocations for a map of type map[int]struct{}
as it grows. On the other hand, interface{}
can hold any values, including pointers. Map buckets keep track of the size of memory prefix holding pointers (ptrdata
field, line 33) to decide if more overflow buckets will be created (map.go, line 265). Refer to this link to see the size of memory prefix holding all pointers for map[int]struct{}
and map[int]interface{}
.
The difference between the two benchmarks (Benchmark_EmptyStruct
and Benchmark_Interface
) is clear when we see the CPU profile. Benchmark_Interface
does not have the (*hmap) createOverflow
method that results in an additional memory allocation flow:
Benchmark_EmptyStruct CPU profile [png, svg]
Benchmark_Interface CPU profile [png, svg]
I customized the tests to pass the number of entries and the map's initial capacity (hint). Here are the results of the executions. Results are basically the same when there are few entries or when the initial capacity is greater than the number of entries. If you have many entries and an initial capacity of 0, you will get quite a different number for allocations.
Benchmark | Entries | InitialCapacity | Speed | Bytes/op | Allocations/op |
---|---|---|---|---|---|
EmptyStruct | 7 | 0 | 115 ns/op | 0 B/op | 0 allocs/op |
Interface | 7 | 0 | 94.8 ns/op | 0 B/op | 0 allocs/op |
EmptyStruct | 8 | 0 | 114 ns/op | 0 B/op | 0 allocs/op |
Interface | 8 | 0 | 110 ns/op | 0 B/op | 0 allocs/op |
EmptyStruct | 9 | 0 | 339 ns/op | 160 B/op | 1 allocs/op |
Interface | 9 | 0 | 439 ns/op | 416 B/op | 1 allocs/op |
EmptyStruct | 16 | 16 | 444 ns/op | 324 B/op | 1 allocs/op |
Interface | 16 | 16 | 586 ns/op | 902 B/op | 1 allocs/op |
EmptyStruct | 16 | 32 | 448 ns/op | 640 B/op | 1 allocs/op |
Interface | 16 | 32 | 724 ns/op | 1792 B/op | 1 allocs/op |
EmptyStruct | 16 | 100 | 634 ns/op | 1440 B/op | 2 allocs/op |
Interface | 16 | 100 | 1241 ns/op | 4128 B/op | 2 allocs/op |
EmptyStruct | 100 | 0 | 5339 ns/op | 3071 B/op | 17 allocs/op |
Interface | 100 | 0 | 6524 ns/op | 7824 B/op | 7 allocs/op |
EmptyStruct | 100 | 128 | 2665 ns/op | 3109 B/op | 2 allocs/op |
Interface | 100 | 128 | 3938 ns/op | 8224 B/op | 2 allocs/op |
There's nothing wrong with the benchmarking methodology. It's all related to map optimization for performance and memory management. The benchmarks show the average value per iteration. Maps of type map[int]interface{}
are slower because they suffer performance degradation when GC scans the buckets that can hold pointers. Maps of type map[int]struct{}
use less memory because they, in fact, use less memory (Test_EmptyStructValueSize
shows that struct{}{}
has zero size). Despite nil
being the zero value for interface{}
, this type requires some space to store ANY value (Test_NilInterfaceValueSize
test shows the size of an interface{}
holding a nil
value is not zero). Finally, the empty struct benchmark allocations are higher because the type map[int]struct{}
requires more overflow buckets (for performance optimization) since its buckets don't hold any pointers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With