Why is passing pointers through a channel slower?

I'm new to Go and am trying to rewrite my Java server project in it.

I found that passing pointers into a channel causes an almost 30% performance drop compared to passing values.

Here is a sample snippet:

package main

import (
    "fmt"
    "time"
)

var c = make(chan t, 1024)
// var c = make(chan *t, 1024)

type t struct {
    a uint
    b uint
}

func main() {
    start := time.Now()
    for i := 0; i < 1000; i++ {
        b := t{a: 3, b: 5}
        // c <- &b
        c <- b
    }
    elapsed := time.Since(start)
    fmt.Println(elapsed)
}

Update: added the missing package statement.

user7305404 asked Dec 16 '16 06:12


2 Answers

As a value it can be stack allocated:

go run -gcflags '-m' tmp.go
# command-line-arguments
./tmp.go:18: inlining call to time.Time.Nanosecond
./tmp.go:24: inlining call to time.Time.Nanosecond
./tmp.go:25: t2 escapes to heap
./tmp.go:25: main ... argument does not escape
63613

As a pointer, it escapes to the heap:

go run -gcflags '-m' tmp.go
# command-line-arguments
./tmp.go:18: inlining call to time.Time.Nanosecond
./tmp.go:24: inlining call to time.Time.Nanosecond
./tmp.go:21: &b escapes to heap <-- Additional GC pressure
./tmp.go:20: moved to heap: b   <-- 
./tmp.go:25: t2 escapes to heap
./tmp.go:25: main ... argument does not escape
122513

Escaping to the heap adds allocation overhead and GC pressure.

Looking at the assembly, the pointer version also introduces additional instructions, including:

go run -gcflags '-S' tmp.go
0x0055 00085 (...tmp.go:18) CALL    runtime.newobject(SB)

The non-pointer variant doesn't incur this overhead before calling runtime.chansend1.
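The extra allocation can also be observed at run time rather than in compiler output. The following is a minimal sketch, not the asker's program: the type `msg` and the helpers `sendValue`, `sendPointer` and `allocsPerSend` are names made up for this illustration, and it uses `testing.AllocsPerRun` from the standard library to count heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// msg mirrors the two-field struct t from the question (renamed for this sketch).
type msg struct {
	a uint
	b uint
}

var (
	valueCh   = make(chan msg, 1)
	pointerCh = make(chan *msg, 1)
)

// sendValue copies the struct into the channel buffer, so m can stay on the stack.
func sendValue() {
	m := msg{a: 3, b: 5}
	valueCh <- m
	<-valueCh
}

// sendPointer publishes &m through the channel, so m has to live on the heap.
func sendPointer() {
	m := msg{a: 3, b: 5}
	pointerCh <- &m
	<-pointerCh
}

// allocsPerSend reports the average number of heap allocations per call
// for the value variant and the pointer variant.
func allocsPerSend() (value, pointer float64) {
	return testing.AllocsPerRun(1000, sendValue), testing.AllocsPerRun(1000, sendPointer)
}

func main() {
	value, pointer := allocsPerSend()
	fmt.Println("value send allocs/op:  ", value)
	fmt.Println("pointer send allocs/op:", pointer)
}
```

With the standard gc toolchain one would expect the value variant to report no allocations and the pointer variant to report one per send, which matches the `moved to heap: b` line above.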

Martin Gallagher answered Nov 11 '22 04:11


To supplement Martin Gallagher's good analysis: the way you are measuring is suspect. The performance of such tiny programs varies a lot from run to run, so the measurement should be repeated many times. There are also some mistakes in your example.

First: it doesn't compile because the package statement is missing.

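Second: there is an important difference between Nanoseconds and Nanosecond. Time.Nanosecond returns the nanosecond offset within the current second (0 to 999999999), whereas Duration.Nanoseconds returns a whole duration as a nanosecond count.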
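To illustrate the difference, here is a small sketch with fixed timestamps so that the result does not depend on the wall clock. The names `sampleStart`, `sampleEnd`, `naiveDiff` and `elapsedNanos` are invented for this example; only `Time.Nanosecond`, `Time.Sub` and `Duration.Nanoseconds` come from the standard library.

```go
package main

import (
	"fmt"
	"time"
)

// Two fixed instants 250 ms apart that straddle a second boundary.
var (
	sampleStart = time.Date(2016, time.December, 16, 6, 12, 0, 900000000, time.UTC)
	sampleEnd   = sampleStart.Add(250 * time.Millisecond)
)

// naiveDiff subtracts the within-second offsets, which is NOT an elapsed time.
func naiveDiff(start, end time.Time) int {
	return end.Nanosecond() - start.Nanosecond()
}

// elapsedNanos measures the real distance between the two instants.
func elapsedNanos(start, end time.Time) int64 {
	return end.Sub(start).Nanoseconds()
}

func main() {
	fmt.Println(naiveDiff(sampleStart, sampleEnd))    // -750000000
	fmt.Println(elapsedNanos(sampleStart, sampleEnd)) // 250000000
}
```

The first number is negative because the offset wraps around at every full second, so differences of `Nanosecond()` values are only meaningful by accident.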

I tried to evaluate your observation this way*:

package main

import (
    "time"
    "fmt"
)

const (
    chan_size = 1000
    cycle_count = 1000
)

var (
    v_ch = make(chan t, chan_size)
    p_ch = make(chan *t, chan_size)
)

type t struct {
    a uint
    b uint
}

func fill_v() {
    for i := 0; i < chan_size; i++ {
        b := t{a:3, b:5}
        v_ch <- b
    }
}

func fill_p() {
    for i := 0; i < chan_size; i++ {
        b := t{a:3, b:5}
        p_ch <- &b
    }
}

// measure_f returns the wall-clock time of one call to f, in nanoseconds.
func measure_f(f func()) int64 {
    start := time.Now()
    f()
    elapsed := time.Since(start)
    return elapsed.Nanoseconds()
}

func main() {
    var v_nanos int64
    var p_nanos int64
    for i := 0; i < cycle_count; i++ {
        v_nanos += measure_f(fill_v)
        // Drain the channel outside the timed region.
        for j := 0; j < chan_size; j++ {
            <-v_ch
        }
    }
    for i := 0; i < cycle_count; i++ {
        p_nanos += measure_f(fill_p)
        // Drain the channel outside the timed region.
        for j := 0; j < chan_size; j++ {
            <-p_ch
        }
    }
    fmt.Println(
        "v:", v_nanos/cycle_count,
        " p:", p_nanos/cycle_count,
        "ratio (v/p):", float64(v_nanos)/float64(p_nanos))
}

There is indeed a measurable performance drop (I define it as drop = 1 - (candidate/optimum), which for the ratio printed above is 1 - v/p). However, even though I repeat the measurement 1000 times, the drop varies between 25% and 50% from run to run. I'm not even sure how and when the heap is recycled, so it may be hard to quantify at all.
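An alternative way to repeat the measurement is to let the standard `testing` package choose the iteration count. The sketch below is an assumption-laden variant, not the program above: `msg`, `benchValue`, `benchPointer` and `runBenchmarks` are hypothetical names, and each iteration sends and immediately receives one element, so the receive cost is included in the timing.

```go
package main

import (
	"fmt"
	"testing"
)

// msg mirrors the two-field struct t used above (renamed for this sketch).
type msg struct {
	a uint
	b uint
}

// benchValue sends the struct by value through a buffered channel.
func benchValue(b *testing.B) {
	ch := make(chan msg, 1)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		ch <- msg{a: 3, b: 5}
		<-ch
	}
}

// benchPointer sends a pointer, which forces one heap allocation per iteration.
func benchPointer(b *testing.B) {
	ch := make(chan *msg, 1)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		m := msg{a: 3, b: 5}
		ch <- &m
		<-ch
	}
}

// runBenchmarks runs both variants outside of "go test" via testing.Benchmark.
func runBenchmarks() (value, pointer testing.BenchmarkResult) {
	return testing.Benchmark(benchValue), testing.Benchmark(benchPointer)
}

func main() {
	value, pointer := runBenchmarks()
	fmt.Println("value:  ", value.NsPerOp(), "ns/op,", value.AllocsPerOp(), "allocs/op")
	fmt.Println("pointer:", pointer.NsPerOp(), "ns/op,", pointer.AllocsPerOp(), "allocs/op")
}
```

The ns/op figures will still differ between machines and runs, but the allocs/op column makes the source of the difference visible independently of timing noise.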


*see a "running" demo on ideone

...note that stdout is frozen: v: 34875 p: 59420 ratio (v/p)0.586923845267128

For some reason, it was not possible to run this code in the Go Playground.

Wolf answered Nov 11 '22 02:11