If I have a struct like:
type Foo struct {
    title string
    Tags  map[string]string
}
How might I approach maintaining a unique set of such structs? From what I understand, struct equality is a thing, but map equality isn't. This means I can't compare my structs above, and therefore I can't just implement the usual map-as-set pattern.
The two options I can think of that might work are: convert the Tags to a sorted [][]string, or use reflect.DeepEqual. Does anyone have a better idea?
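For what it's worth, the reflect.DeepEqual option works without any extra code, since DeepEqual descends into maps; a minimal sketch:

package main

import (
    "fmt"
    "reflect"
)

type Foo struct {
    title string
    Tags  map[string]string
}

func main() {
    a := Foo{title: "x", Tags: map[string]string{"k": "v"}}
    b := Foo{title: "x", Tags: map[string]string{"k": "v"}}
    fmt.Println(reflect.DeepEqual(a, b)) // true: DeepEqual compares map contents
}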
There are a few ways of implementing this. James Henstridge actually had a good idea (hash each struct into a string key and use it in a map), and I attempted to implement it. It turned out that just using a map in the first place performed pretty poorly, even before adding my own hash algorithm.
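For context, the map-as-set idea looks roughly like this (my reconstruction for illustration, not necessarily the exact code benchmarked below): derive a deterministic string key from each struct and use it as the map key.

import (
    "sort"
    "strings"
)

// fooKey builds a deterministic string key for a Foo. Map iteration order
// is random, so the Tags keys must be sorted first. fooKey is an assumed
// name; the NUL bytes separate fields to avoid ambiguous concatenations.
func fooKey(f Foo) string {
    keys := make([]string, 0, len(f.Tags))
    for k := range f.Tags {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    var b strings.Builder
    b.WriteString(f.title)
    for _, k := range keys {
        b.WriteString("\x00")
        b.WriteString(k)
        b.WriteString("\x00")
        b.WriteString(f.Tags[k])
    }
    return b.String()
}

// Usage as a set:
//     set := make(map[string]Foo)
//     set[fooKey(f)] = f      // insert; duplicates overwrite
//     _, ok := set[fooKey(f)] // membership test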
The way I solved this problem is to just keep a slice of your structs and skip any duplicates as you insert them.
package structset

type Foo struct {
    title string
    Tags  map[string]string
}

// Equals reports whether two Foos have the same title and identical Tags.
func (f Foo) Equals(f2 Foo) bool {
    if f.title != f2.title {
        return false
    }
    if len(f.Tags) != len(f2.Tags) {
        return false
    }
    for k, v := range f.Tags {
        if w, ok := f2.Tags[k]; !ok || v != w {
            return false
        }
    }
    return true
}
// FooSet is a slice-backed set of Foo values.
type FooSet []Foo

// Add appends value if it is not already present. Note the pointer
// receiver: append may reallocate, and the updated slice header must be
// written back so the caller sees the new element.
func (s *FooSet) Add(value Foo) {
    if !s.Contains(value) {
        *s = append(*s, value)
    }
}

func (s FooSet) Length() int {
    return len(s)
}

// Contains is a linear scan, which makes each Add O(n).
func (s FooSet) Contains(f Foo) bool {
    for _, v := range s {
        if v.Equals(f) {
            return true
        }
    }
    return false
}

func NewSet() FooSet {
    return FooSet(make([]Foo, 0, 100))
}
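Usage looks like this (Add needs an addressable value because of the pointer receiver):

set := NewSet()
set.Add(Foo{title: "a", Tags: map[string]string{"env": "prod"}})
set.Add(Foo{title: "a", Tags: map[string]string{"env": "prod"}}) // duplicate, ignored
fmt.Println(set.Length()) // prints 1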
I benchmarked this on my i7-3770K Windows machine and got:
BenchmarkSmallSetWithFewCollisions 50000 46615 ns/op
BenchmarkSmallSetWithMoreCollisions 50000 46575 ns/op
BenchmarkSmallSetWithManyCollisions 50000 46605 ns/op
BenchmarkMediumSetWithFewCollisions 1000 2335296 ns/op
BenchmarkMediumSetWithMoreCollisions 1000 2352298 ns/op
BenchmarkMediumSetWithManyCollisions 1000 2336796 ns/op
BenchmarkLargeSetWithFewCollisions 50 46805944 ns/op
BenchmarkLargeSetWithMoreCollisions 50 47376016 ns/op
BenchmarkLargeSetWithManyCollisions 50 46815946 ns/op
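The benchmarks have roughly this shape (a hypothetical reconstruction, not my exact harness; makeFoos is an assumed helper that generates n values with a given duplicate rate):

func BenchmarkSmallSetWithFewCollisions(b *testing.B) {
    foos := makeFoos(100, 0.1) // assumed helper: 100 values, ~10% duplicates
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        set := NewSet()
        for _, f := range foos {
            set.Add(f)
        }
    }
}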
To eke out a very small amount of extra performance, you can append all your data to the slice first and then remove the duplicates afterwards. The remove-duplicates code is:
// RemoveDuplicates removes duplicates in place by overwriting each
// duplicate with the last element and shrinking the slice. The pointer
// receiver is required so the shortened slice is visible to the caller.
func (s *FooSet) RemoveDuplicates() {
    set := *s
    length := len(set) - 1 // index of the last element
    for i := 0; i < length; i++ {
        for j := i + 1; j <= length; j++ {
            if set[i].Equals(set[j]) {
                set[j] = set[length] // move the last element into slot j
                length--
                j-- // re-examine slot j, which now holds a different element
            }
        }
    }
    *s = set[:length+1]
}
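The insert-everything-then-dedupe pattern then looks like:

set := NewSet()
for _, f := range foos { // foos is whatever []Foo you are loading
    set = append(set, f) // plain append, no Contains check on the hot path
}
set.RemoveDuplicates()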
The benchmarks for this are:
BenchmarkSmallSetWithFewCollisions 50000 45245 ns/op
BenchmarkSmallSetWithMoreCollisions 50000 45615 ns/op
BenchmarkSmallSetWithManyCollisions 50000 45555 ns/op
BenchmarkMediumSetWithFewCollisions 1000 2294791 ns/op
BenchmarkMediumSetWithMoreCollisions 1000 2309293 ns/op
BenchmarkMediumSetWithManyCollisions 1000 2286290 ns/op
BenchmarkLargeSetWithFewCollisions 50 46235870 ns/op
BenchmarkLargeSetWithMoreCollisions 50 46515906 ns/op
BenchmarkLargeSetWithManyCollisions 50 45865824 ns/op
For comparison, here is the benchmark of just assigning each Foo to a map[string]Foo (the map-as-set approach sketched above).
BenchmarkSmallSetWithFewCollisions 50000 65718 ns/op
BenchmarkSmallSetWithMoreCollisions 50000 64238 ns/op
BenchmarkSmallSetWithManyCollisions 50000 55016 ns/op
BenchmarkMediumSetWithFewCollisions 500 3429435 ns/op
BenchmarkMediumSetWithMoreCollisions 500 3117395 ns/op
BenchmarkMediumSetWithManyCollisions 1000 2826858 ns/op
BenchmarkLargeSetWithFewCollisions 20 82635495 ns/op
BenchmarkLargeSetWithMoreCollisions 20 85285830 ns/op
BenchmarkLargeSetWithManyCollisions 20 73659350 ns/op
It seems to me that even if a map were hashable, it still wouldn't perform very well.