var testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

//var testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"

func BenchmarkHashing900000000(b *testing.B) {
	var bufByte = bytes.Buffer{}
	for i := 0; i < b.N; i++ {
		bufByte.WriteString(testString)
		Sum32(bufByte.Bytes())
		bufByte.Reset()
	}
}

func BenchmarkHashingWithNew900000000(b *testing.B) {
	for i := 0; i < b.N; i++ {
		bytStr := []byte(testString)
		Sum32(bytStr)
	}
}
Test results:
With testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
BenchmarkHashing900000000-4          50000000    35.2 ns/op     0 B/op    0 allocs/op
BenchmarkHashingWithNew900000000-4   50000000    30.9 ns/op     0 B/op    0 allocs/op
With testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"
BenchmarkHashing900000000-4          30000000    46.6 ns/op     0 B/op    0 allocs/op
BenchmarkHashingWithNew900000000-4   20000000    73.0 ns/op    64 B/op    1 allocs/op
Why is there an allocation in BenchmarkHashingWithNew900000000 when the string is long, but no allocation when the string is short?
Sum32 : https://gowalker.org/github.com/spaolacci/murmur3
I am using go1.6
Your benchmarks are observing a curious optimisation by the Golang compiler (version 1.8).
You can see the change from Dmitry Vyukov here:
https://go-review.googlesource.com/c/go/+/3120
Unfortunately, that change dates from a long time ago, when the compiler was still written in C, and I am not sure where to find the optimisation in the current compiler. I can confirm that it still exists, though, and that Dmitry's description of it is accurate.
If you want a clearer, self-contained set of benchmarks to demonstrate this, I have a gist here:
https://gist.github.com/fmstephe/f0eb393c4ec41940741376ab08cbdf7e
If we look only at the second benchmark, BenchmarkHashingWithNew900000000, we can see a clear spot where it 'should' allocate:
bytStr := []byte(testString)
This line must copy the contents of testString into a new []byte. However, in this case the compiler can see that bytStr is never used again after Sum32 returns, so it can be allocated on the stack. Because strings can be arbitrarily large, though, a limit of 32 bytes is set for a stack-allocated string or []byte. Your 26-byte test string fits under that limit; the 52-byte one does not, so its copy is allocated on the heap.
It's worth being aware of this little optimisation, because it is easy to fool yourself into believing some code does not allocate if your benchmark strings are all short.