Multipart form uploads + memory leaks in golang?

The following server code:

package main

import (
  "fmt"
  "net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
  // FormFile implicitly parses the multipart form, buffering up to
  // 32 MB of the request body in memory before spilling to disk.
  file, _, err := r.FormFile("file")
  if err != nil {
    fmt.Fprintln(w, err)
    return
  }
  defer file.Close()
}

func main() {
  http.ListenAndServe(":8081", http.HandlerFunc(handler))
}

being run and then calling it with:

curl -i -F "file=@./large-file" --form hello=world http://localhost:8081/

Where large-file is about 80MB, there seems to be some form of memory leak in Go 1.4.2 on darwin/amd64 and linux/amd64.

When I hook up pprof, I see that bytes.makeSlice (eventually called by r.FormFile in my code above) uses 96MB of memory after calling the service a few times.
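One way to wire pprof into a server like this (a sketch using the standard net/http/pprof package; the original setup isn't shown in the question) is to serve everything on the default mux:

import (
  "net/http"
  _ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
  // Register the upload handler on the default mux too, so both it and
  // the pprof endpoints are served. "/debug/pprof/" is the longer match,
  // so it still wins for profiling requests.
  http.HandleFunc("/", handler)
  http.ListenAndServe(":8081", nil)
}

The heap can then be inspected with go tool pprof http://localhost:8081/debug/pprof/heap.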

If I keep calling curl, the memory usage of the process grows slowly over time, eventually seeming to stick around 300MB on my machine.

Thoughts? I assume this isn't expected and that I'm doing something wrong?

asked Jun 05 '15 by Michael Wasser

1 Answer

If the memory usage stagnates at a "maximum", I wouldn't really call that a memory leak. I would rather say the GC is not eager but lazy, or that it simply doesn't want to physically free memory that is frequently reallocated / needed. If it were really a memory leak, used memory wouldn't stop at 300 MB.

r.FormFile("file") results in a call to Request.ParseMultipartForm(), with 32 MB used as the value of the maxMemory parameter (the value of the defaultMaxMemory variable defined in request.go). Since you upload a larger file (80 MB), a buffer of at least 32 MB will eventually be created (this is implemented in multipart.Reader.ReadForm()). Since a bytes.Buffer is used to read the content, the reading process starts with a small or empty buffer and reallocates whenever a bigger one is needed.
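For illustration, here is a simplified sketch of what r.FormFile does under the hood (not the verbatim stdlib source, but close to the logic in request.go):

import (
  "mime/multipart"
  "net/http"
)

const defaultMaxMemory = 32 << 20 // 32 MB, as in request.go

// formFile mimics r.FormFile: parse the multipart form with the default
// in-memory limit, then look the named file up in the parsed form.
func formFile(r *http.Request, key string) (multipart.File, *multipart.FileHeader, error) {
  if r.MultipartForm == nil {
    if err := r.ParseMultipartForm(defaultMaxMemory); err != nil {
      return nil, nil, err
    }
  }
  if fhs := r.MultipartForm.File[key]; len(fhs) > 0 {
    f, err := fhs[0].Open()
    return f, fhs[0], err
  }
  return nil, nil, http.ErrMissingFile
}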

The strategy of buffer reallocations and the buffer sizes are implementation dependent (and also depend on the size of the chunks being read/decoded from the request), but just to get a rough picture, imagine the successive capacities like this: 0 bytes, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB, 64 MB. Again, this is just theoretical, but it illustrates that the sum can grow beyond 100 MB just to read the first 32 MB of the file into memory, at which point it is decided that the content will be moved to / stored in a file. See the implementation of multipart.Reader.ReadForm() for details. This reasonably explains the 96 MB allocation.
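To see the reallocation pattern concretely, here is a small standalone program (assuming a modern Go toolchain; the exact growth steps vary by Go version) that writes 32 MB into a bytes.Buffer in 4 KB chunks and prints each time the capacity changes:

package main

import (
  "bytes"
  "fmt"
)

func main() {
  var buf bytes.Buffer
  chunk := make([]byte, 4<<10) // 4 KB per write
  prevCap := 0
  for i := 0; i < 8*1024; i++ { // 8192 * 4 KB = 32 MB total
    buf.Write(chunk)
    if c := buf.Cap(); c != prevCap {
      fmt.Printf("len=%9d cap=%9d\n", buf.Len(), c)
      prevCap = c
    }
  }
}

Summing the capacities of the intermediate buffers that were allocated and discarded along the way gives a total well above the final 32 MB, all of which the GC has to clean up later.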

Do this a couple of times, and without the GC releasing the allocated buffers immediately, you can easily end up with 300 MB. And if there is enough free memory, there is no pressure on the GC to hurry with releasing it. The reason you see usage grow relatively large is that large buffers are used in the background. If you did the same with a 1MB upload, you would probably not experience this.
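One way to convince yourself it's GC laziness rather than a leak (a sketch using the runtime package; logMem is a hypothetical helper, not part of any library):

import (
  "fmt"
  "runtime"
  "runtime/debug"
)

// logMem prints the live heap vs. total memory obtained from the OS.
func logMem(tag string) {
  var m runtime.MemStats
  runtime.ReadMemStats(&m)
  fmt.Printf("%s: HeapAlloc=%d MB, Sys=%d MB\n", tag, m.HeapAlloc>>20, m.Sys>>20)
}

func demo() {
  logMem("before")
  // Force a GC and ask the runtime to return as much memory to the OS as possible.
  debug.FreeOSMemory()
  logMem("after")
}

If HeapAlloc drops sharply after FreeOSMemory, the big buffers were garbage all along; the runtime simply hadn't handed the pages back yet.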

If it is important to you, you can also call Request.ParseMultipartForm() manually with a smaller maxMemory value, e.g.

r.ParseMultipartForm(2 << 20) // 2 MB in-memory limit; larger parts spill to temp files
file, _, err := r.FormFile("file")
// ... rest of your handler

Doing so, much smaller (and fewer) buffers will be allocated in the background.
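Putting it together, a sketch of the full handler with the smaller limit (the 2 MB cap, the error handling, and the temp-file cleanup are my additions, not part of the original question):

func handler(w http.ResponseWriter, r *http.Request) {
  // Cap in-memory buffering at 2 MB; larger parts spill to temp files.
  if err := r.ParseMultipartForm(2 << 20); err != nil {
    http.Error(w, err.Error(), http.StatusBadRequest)
    return
  }
  defer r.MultipartForm.RemoveAll() // delete any temp files when done

  file, _, err := r.FormFile("file")
  if err != nil {
    http.Error(w, err.Error(), http.StatusBadRequest)
    return
  }
  defer file.Close()
}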

answered Sep 29 '22 by icza