Go doesn't release memory after http.Get

I am loading web pages with a simple worker pool, reading the URLs from a file as I go. But this small program slowly allocates as much memory as my server has, until the OOM killer stops it. It looks like resp.Body.Close() doesn't free the memory for the body text (memory size ~ downloaded pages * avg page size). How can I force Go to free the memory allocated for the HTML body text?

package main

import (
    "bufio"
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
    "strings"
    "sync"
)

func worker(linkChan chan string, wg *sync.WaitGroup) {
    defer wg.Done()

    for url := range linkChan {
        // Getting body text
        resp, err := http.Get(url)
        if err != nil {
            fmt.Printf("Fail url: %s\n", url)
            continue
        }
        body, err := ioutil.ReadAll(resp.Body)
        resp.Body.Close()
        if err != nil {
            fmt.Printf("Fail url: %s\n", url)
            continue
        }
        // Test page body
        has_rem_code := strings.Contains(string(body), "googleadservices.com/pagead/conversion.js")
        fmt.Printf("Done url: %s\t%t\n", url, has_rem_code)
    }
}

func main() {
    // Creating worker pool
    lCh := make(chan string, 30)
    wg := new(sync.WaitGroup)

    for i := 0; i < 30; i++ {
        wg.Add(1)
        go worker(lCh, wg)
    }

    // Opening file with urls
    file, err := os.Open("./tmp/new.csv")
    if err != nil {
        panic(err)
    }
    // Defer Close only after the error check, once file is known to be valid
    defer file.Close()
    reader := bufio.NewReader(file)

    // Processing urls
    for href, _, err := reader.ReadLine(); err == nil; href, _, err = reader.ReadLine() {
        lCh <- string(href)
    }

    close(lCh)
    wg.Wait()
}

Here is some output from the pprof tool:

      flat  flat%   sum%        cum   cum%
   34.63MB 29.39% 29.39%    34.63MB 29.39%  bufio.NewReaderSize
      30MB 25.46% 54.84%       30MB 25.46%  net/http.(*Transport).getIdleConnCh
   23.09MB 19.59% 74.44%    23.09MB 19.59%  bufio.NewWriter
   11.63MB  9.87% 84.30%    11.63MB  9.87%  net/http.(*Transport).putIdleConn
    6.50MB  5.52% 89.82%     6.50MB  5.52%  main.main

This looks like this issue, but that was fixed 2 years ago.

asked Jul 31 '15 14:07 by Denton


1 Answer

Found the answer in this thread on golang-nuts. http.Transport keeps connections open so it can reuse them for later requests to the same host, which caused the memory bloat in my case (hundreds of thousands of different hosts). Disabling keep-alives solves the problem completely.

Working code:

func worker(linkChan chan string, wg *sync.WaitGroup) {
    defer wg.Done()

    var transport http.RoundTripper = &http.Transport{
        DisableKeepAlives: true,
    }

    c := &http.Client{Transport: transport}

    for url := range linkChan {
        // Getting body text
        resp, err := c.Get(url)
        if err != nil {
            fmt.Printf("Fail url: %s\n", url)
            continue
        }
        body, err := ioutil.ReadAll(resp.Body)
        resp.Body.Close()
        if err != nil {
            fmt.Printf("Fail url: %s\n", url)
            continue
        }
        // Test page body
        has_rem_code := strings.Contains(string(body), "googleadservices.com/pagead/conversion.js")
        fmt.Printf("Done url: %s\t%t\n", url, has_rem_code)
    }
}
answered Sep 20 '22 21:09 by Denton