 

How and when does Go allocate memory for bounded-queue channels?

Tags:

memory

go

pprof

I'm using Go's pprof tool to investigate my service's memory usage. Almost all of the memory usage comes from a single function that sets up multiple bounded-queue channels. I'm somewhat confused by what pprof is telling me here:

$ go tool pprof ~/pprof/pprof.server.alloc_objects.alloc_space.inuse_objects.inuse_space.007.pb.gz
File: server
Type: inuse_space
Time: Dec 21, 2020 at 10:46am (PST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) list foo
Total: 102.73MB
ROUTINE ======================== github.com/******/foo in ***.go
   79.10MB    79.10MB (flat, cum) 77.00% of Total
         .          .    135:
         .          .    136:func foo() {
         .          .    137:    
   14.04MB    14.04MB    138:    chanA := make(chan chanAEntry, bufferSize)
         .          .    139:    defer close(chanA)
         .          .    140:
         .          .    141:    
   19.50MB    19.50MB    142:    chanB := make(chan chanBCEntry, bufferSize)
         .          .    143:    defer close(chanB)
         .          .    144:
         .          .    145:    
   27.53MB    27.53MB    146:    chanC := make(chan chanBCEntry, bufferSize)
         .          .    147:    defer close(chanC)
         .          .    148:
         .          .    149:    
    7.92MB     7.92MB    150:    chanD := make(chan chanDEntry, bufferSize)
         .          .    151:    defer close(chanD)
         .          .    152:

It looks like line 142 is responsible for 19.50MB of allocations and line 146 is responsible for 27.53MB, but those lines do the same thing: each creates a buffered channel with the same element type and the same capacity.

  • Is this an artifact of the fact that pprof does random sampling?
  • Does Go allocate channels lazily (fwiw, after letting the service run for a few days these values eventually equalize)?
  • Is pprof reporting the memory required by the objects sent along the channel as well as the memory required by the channel itself?
asked Mar 01 '23 by Greg Owen

1 Answer

OK, I believe I've figured it out: Go allocates the channel buffers eagerly, and the discrepancy is just an artifact of how the Go memory profiler samples allocations.

Go allocates channel memory eagerly

The docs for make promise that

The channel's buffer is initialized with the specified buffer capacity.

I looked into the code for makechan, which is what a make(chan T, size) expression calls at runtime. It always calls mallocgc directly; there is no laziness.

Looking into the code for mallocgc confirms there's no laziness there either: its doc comment says nothing about deferring the allocation, and mallocgc calls c.alloc directly.
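
If you want to see the eager allocation for yourself, here's a minimal sketch (the entry type and buffer size are made up, since the real chanBCEntry and bufferSize aren't shown in the question): the heap grows by roughly the size of the buffer the moment make returns, before anything is ever sent on the channel.

package main

import (
	"fmt"
	"runtime"
)

// entry is a hypothetical 64-byte element standing in for the question's
// chanBCEntry type.
type entry struct {
	payload [64]byte
}

func main() {
	var before, after runtime.MemStats

	runtime.GC() // settle the heap so the delta is easier to read
	runtime.ReadMemStats(&before)

	// The backing buffer is allocated right here, eagerly.
	ch := make(chan entry, 100_000) // roughly 6.4MB of buffer

	runtime.ReadMemStats(&after)
	fmt.Printf("heap grew by ~%d KiB before anything was sent\n",
		(after.HeapAlloc-before.HeapAlloc)/1024)

	close(ch)
}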

pprof samples at the heap allocation level, not the calling function level

While looking around mallocgc, I found the profiling code. Within each mallocgc call, Go will check to see if its sampling condition is met. If so, it calls mProf_Malloc to add a record to the heap profile. I couldn't confirm that this is the profile used by pprof, but comments in that file suggest that it is.

The sampling condition is based on the number of bytes allocated since the previous sample was taken: the next sampling point is drawn from an exponential distribution so that, on average, one sample is recorded per runtime.MemProfileRate bytes allocated.

The important part here is that each call to mallocgc has some probability of being sampled, rather than each call to foo. This means that if a call to foo makes multiple calls to mallocgc, we expect that only some of the mallocgc calls will be sampled.
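
A practical consequence: if you want the per-line numbers to converge without waiting days, you can shrink the sampling period so that nearly every mallocgc call is recorded. A minimal sketch follows; runtime.MemProfileRate is the real knob, but setting it this low adds noticeable overhead, so treat it as a debugging aid and set it before the program allocates anything.

package main

import "runtime"

// Setting the rate in an init function (or a package-level var initializer)
// ensures it takes effect before main starts allocating. A rate of 1 means
// every allocated block is recorded in the heap profile, trading overhead
// for precision; the default is 512 * 1024 bytes.
func init() {
	runtime.MemProfileRate = 1
}

func main() {
	// ... start the service and collect a heap profile as usual ...
}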

Putting it all together

Every time my function foo runs, it eagerly allocates memory for the 4 channels. Each of those allocation calls has some chance of being sampled for the heap profile; on average, Go records a sample once every 512kB allocated (the default value of runtime.MemProfileRate). Since the total size of these channels is 488kB, on average we expect only about one of the allocations to be recorded per call to foo. The profile I shared above was taken relatively soon after the service restarted, so the difference in allocated bytes between the two lines is just statistical variance. After letting the service run for a day, the profile settled down and showed equal allocations for lines 142 and 146.
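
To sanity-check the "roughly one sample per call to foo" intuition, you can compare the per-call buffer footprint against the sampling period. Below is a sketch with invented element sizes and an invented bufferSize (the real types and constant aren't in the question), which is enough to show how close the two numbers can be.

package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// Hypothetical stand-ins for the question's element types; the real
// chanAEntry, chanBCEntry, chanDEntry and bufferSize are not shown.
type chanAEntry struct{ _ [56]byte }
type chanBCEntry struct{ _ [80]byte }
type chanDEntry struct{ _ [32]byte }

const bufferSize = 2048

func main() {
	perCall := bufferSize * (unsafe.Sizeof(chanAEntry{}) +
		2*unsafe.Sizeof(chanBCEntry{}) + // chanB and chanC share a type
		unsafe.Sizeof(chanDEntry{}))

	fmt.Printf("channel buffers per call to foo: %d KiB\n", perCall/1024)
	fmt.Printf("heap profile sampling period:    %d KiB\n", runtime.MemProfileRate/1024)
	// When these two numbers are comparable, each call to foo is expected to
	// produce roughly one sample, landing on whichever make() call happened
	// to cross the sampling threshold.
}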

answered Mar 05 '23 by Greg Owen