I am trying to profile my Go library to find out why it is so much slower than the equivalent code in C++.
I have a simple benchmark:
func BenchmarkFile(b *testing.B) {
	tmpFile, err := ioutil.TempFile("", TMP_FILE_PREFIX)
	if err != nil {
		b.Fatal(err)
	}
	fw, err := NewFile(tmpFile.Name())
	if err != nil {
		b.Fatal(err)
	}
	text := []byte("testing")
	for i := 0; i < b.N; i++ {
		_, err = fw.Write(text)
	}
	fw.Close()
}
NewFile returns my custom Writer, which encodes data into our binary representation, even compresses it, and writes it to the file system.
Running go test -bench . -memprofile mem.out -cpuprofile cpu.out
I get
PASS
BenchmarkFile-16 2000000000 0.20 ns/op
ok .../writer/iowriter 9.074s
Then I analyse it:
# go tool pprof cpu.out
Entering interactive mode (type "help" for commands)
(pprof) top10
930ms of 930ms total ( 100%)
flat flat% sum% cum cum%
930ms 100% 100% 930ms 100%
(pprof)
I even tried writing an example.go app that uses my writer and added pprof.StartCPUProfile(f),
as shown in http://blog.golang.org/profiling-go-programs, but got the same result.
What am I doing wrong, and how can I determine the bottleneck in my library? Thank you in advance.
OK, it was easy: I forgot to pass the binary to go tool pprof, so it has to be
# go tool pprof write cpu.out
Entering interactive mode (type "help" for commands)
(pprof) top10
7.02s of 7.38s total (95.12%)
Dropped 14 nodes (cum <= 0.04s)
Showing top 10 nodes out of 32 (cum >= 0.19s)
flat flat% sum% cum cum%
6.55s 88.75% 88.75% 6.76s 91.60% syscall.Syscall
...
When using benchmark tests, the test binary is created in the same place, and passing it to go tool pprof works the same way.