I'm scraping HTML pages and have set up an HTTP client like so:
client := &http.Client{
	Transport: &http.Transport{
		Dial: (&net.Dialer{
			Timeout:   30 * time.Second,
			KeepAlive: 30 * time.Second,
		}).Dial,
		TLSHandshakeTimeout:   10 * time.Second,
		ResponseHeaderTimeout: 10 * time.Second,
	},
}
Now when I make GET requests to multiple URLs, I don't want to get stuck on URLs that deliver a massive amount of data.
response, err := client.Get(page.Url)
checkErr(err)
defer response.Body.Close()
body, err := ioutil.ReadAll(response.Body)
checkErr(err)
page.Body = string(body)
Is there a way to limit the amount of data (bytes) the GET request accepts from a resource, and to stop reading once that limit is reached?
Use an io.LimitedReader
A LimitedReader reads from R but limits the amount of data returned to just N bytes.
limitedReader := &io.LimitedReader{R: response.Body, N: limit}
body, err := ioutil.ReadAll(limitedReader)
or
body, err := ioutil.ReadAll(io.LimitReader(response.Body, limit))
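Putting it together, here is a minimal sketch of the whole fetch with a byte cap. The 1 MB limit and the example URL are assumptions for illustration; reading limit+1 bytes lets you detect whether the body was actually larger than the cap, and closing the body lets the connection be reused.

package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"net"
	"net/http"
	"time"
)

func main() {
	client := &http.Client{
		Transport: &http.Transport{
			Dial: (&net.Dialer{
				Timeout:   30 * time.Second,
				KeepAlive: 30 * time.Second,
			}).Dial,
			TLSHandshakeTimeout:   10 * time.Second,
			ResponseHeaderTimeout: 10 * time.Second,
		},
	}

	const limit = 1 << 20 // assumed cap for illustration: at most 1 MB of the body

	response, err := client.Get("https://example.com/") // placeholder URL
	if err != nil {
		fmt.Println(err)
		return
	}
	defer response.Body.Close() // always close the body so the connection can be reused

	// Read up to limit+1 bytes; anything beyond that stays unread on the wire.
	body, err := ioutil.ReadAll(io.LimitReader(response.Body, limit+1))
	if err != nil {
		fmt.Println(err)
		return
	}
	if len(body) > limit {
		fmt.Println("response exceeded the byte limit; truncating body")
		body = body[:limit]
	}
	fmt.Println(len(body), "bytes read")
}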