I just solved problem 23 on Project Euler, but I noticed a big performance difference between map[int]bool and []bool.
I have a function that sums up the proper divisors of a number:
func divisorsSum(n int) int {
    sum := 1
    for i := 2; i*i <= n; i++ {
        if n%i == 0 {
            sum += i
            if n/i != i {
                sum += n / i
            }
        }
    }
    return sum
}
Then in main I do this:
func main() {
    start := time.Now()
    defer func() {
        elapsed := time.Since(start)
        fmt.Printf("%s\n", elapsed)
    }()
    n := 28123
    abundant := []int{}
    for i := 12; i <= n; i++ {
        if divisorsSum(i) > i {
            abundant = append(abundant, i)
        }
    }
    sums := map[int]bool{}
    for i := 0; i < len(abundant); i++ {
        for j := i; j < len(abundant); j++ {
            if abundant[i]+abundant[j] > n {
                break
            }
            sums[abundant[i]+abundant[j]] = true
        }
    }
    sum := 0
    for i := 1; i <= 28123; i++ {
        if _, ok := sums[i]; !ok {
            sum += i
        }
    }
    fmt.Println(sum)
}
This code takes 450ms on my computer. But if I change main to use a slice of bool instead of a map, like this:
func main() {
    start := time.Now()
    defer func() {
        elapsed := time.Since(start)
        fmt.Printf("%s\n", elapsed)
    }()
    n := 28123
    abundant := []int{}
    for i := 12; i <= n; i++ {
        if divisorsSum(i) > i {
            abundant = append(abundant, i)
        }
    }
    sums := make([]bool, n)
    for i := 0; i < len(abundant); i++ {
        for j := i; j < len(abundant); j++ {
            if abundant[i]+abundant[j] < n {
                sums[abundant[i]+abundant[j]] = true
            } else {
                break
            }
        }
    }
    sum := 0
    for i := 0; i < len(sums); i++ {
        if !sums[i] {
            sum += i
        }
    }
    fmt.Println(sum)
}
Now it takes only 40ms, less than a tenth of the previous time. I thought maps were supposed to have faster lookups. What explains the performance difference here?
You can profile your code and see, but in general, there are two main reasons:
You pre-allocate sums in the second example to its final size. It never has to grow, so there are no reallocations and no extra GC pressure, which makes it very efficient. Try creating the map with the desired size in advance and see how much that improves things.
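I don't know the internal implementation of Go's hash map, but in general, random access into an array/slice by integer index is super efficient: it is little more than a bounds check and an address computation. A hash table adds overhead on top of that, especially if it hashes the integers (it might do so to get a better distribution across buckets). Both structures offer constant-time lookups on average, but the constant for a map is much larger.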
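To try both points yourself, here is a rough sketch that times the same marking workload against a slice, a map created without a size hint, and a map created with one. The names makeKeys, distinctWithSlice, distinctWithMap and timeIt are made up for this illustration, and the key pattern is synthetic rather than the abundant-number sums from the question. Timings depend on your machine and Go version, so treat the output as indicative only.

```go
package main

import (
	"fmt"
	"time"
)

// makeKeys returns count keys in the range [0, n], with many repeats,
// to imitate writing the same entries over and over. (Hypothetical helper.)
func makeKeys(n, count int) []int {
	keys := make([]int, count)
	for i := range keys {
		keys[i] = (i * 7919) % (n + 1)
	}
	return keys
}

// distinctWithSlice marks each key in a pre-sized []bool indexed by the key
// and returns how many distinct keys were seen.
func distinctWithSlice(keys []int, n int) int {
	seen := make([]bool, n+1)
	count := 0
	for _, k := range keys {
		if !seen[k] {
			seen[k] = true
			count++
		}
	}
	return count
}

// distinctWithMap marks each key in a map. sizeHint is passed to make;
// 0 means the map starts small and grows as entries are added.
func distinctWithMap(keys []int, sizeHint int) int {
	seen := make(map[int]bool, sizeHint)
	for _, k := range keys {
		seen[k] = true
	}
	return len(seen)
}

// timeIt runs f once and prints how long it took along with its result.
func timeIt(label string, f func() int) {
	start := time.Now()
	result := f()
	fmt.Printf("%-20s %d distinct keys in %s\n", label, result, time.Since(start))
}

func main() {
	const n = 28123
	keys := makeKeys(n, 2000000)

	timeIt("slice", func() int { return distinctWithSlice(keys, n) })
	timeIt("map, no size hint", func() int { return distinctWithMap(keys, 0) })
	timeIt("map, size hint", func() int { return distinctWithMap(keys, n+1) })
}
```

A single timed run like this is noisy. For more reliable numbers, the same functions could be wrapped in benchmarks using the standard testing package and run with go test -bench.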