
When does Go allocate for a string to []byte conversion?

Tags: go, allocation

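// The package name is a placeholder; this goes in a _test.go file.
package hashing

import (
    "bytes"
    "testing"

    "github.com/spaolacci/murmur3"
)

var testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

//var testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"

// Reuses a single bytes.Buffer across all iterations.
func BenchmarkHashing900000000(b *testing.B) {
    var bufByte = bytes.Buffer{}
    for i := 0; i < b.N; i++ {
        bufByte.WriteString(testString)
        murmur3.Sum32(bufByte.Bytes())
        bufByte.Reset()
    }
}

// Converts the string to a fresh []byte on every iteration.
func BenchmarkHashingWithNew900000000(b *testing.B) {
    for i := 0; i < b.N; i++ {
        bytStr := []byte(testString)
        murmur3.Sum32(bytStr)
    }
}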

Test results:

With testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
BenchmarkHashing900000000-4         50000000            35.2 ns/op         0 B/op          0 allocs/op
BenchmarkHashingWithNew900000000-4  50000000            30.9 ns/op         0 B/op          0 allocs/op

With testString = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ"
BenchmarkHashing900000000-4         30000000            46.6 ns/op         0 B/op          0 allocs/op
BenchmarkHashingWithNew900000000-4  20000000            73.0 ns/op        64 B/op          1 allocs/op

Why is there an allocation in BenchmarkHashingWithNew900000000 when the string is long, but no allocation when the string is short?
Sum32: https://gowalker.org/github.com/spaolacci/murmur3
I am using go1.6.

Vishal Kumar asked Jul 24 '16 17:07



1 Answer

Your benchmarks are observing a curious optimisation in the Go compiler (version 1.8).

You can see the change (CL 3120) from Dmitry Vyukov here:

https://go-review.googlesource.com/c/go/+/3120

Unfortunately, that change dates from a long time ago, when the compiler was still written in C, and I am not sure where to find the optimisation in the current compiler. But I can confirm that it still exists, and that Dmitry's description of the change is accurate.

If you want a clearer, self-contained set of benchmarks that demonstrates this, I have a gist here:

https://gist.github.com/fmstephe/f0eb393c4ec41940741376ab08cbdf7e

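If we look only at the second benchmark, BenchmarkHashingWithNew900000000, we can see a clear spot where it 'should' allocate:

bytStr := []byte(testString)

This line must copy the contents of testString into a new []byte. In this case, however, the compiler can see that bytStr does not escape: it is never used again after Sum32 returns, so its backing array can live on the stack. Because strings can be arbitrarily large, the stack buffer used for such a string or []byte is limited to 32 bytes. Your 26-byte test string fits in that buffer, so nothing is allocated. The 52-byte string does not fit, so the copy goes to the heap, which is the 1 allocs/op and 64 B/op in your results (52 bytes rounded up to the allocator's 64-byte size class).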

It's worth being aware of this little trick, because if your benchmark strings are all short it is easy to fool yourself into believing that some code does not allocate.

Francis Stephens answered Nov 15 '22 03:11