I am using rest-client to download a large page (around 1.5 GB in size). The retrieved value is stored in memory and then saved into a file. As a result, my program crashes with "failed to allocate memory (NoMemoryError)".
But it is not necessary to keep this data in memory; it could be saved directly to disk.
I found "You can: (...) manually handle the response (e.g. to operate on it as a stream rather than reading it all into memory) See RestClient::Request's documentation for more information." on https://github.com/rest-client/rest-client Unfortunately after reading http://www.rubydoc.info/gems/rest-client/1.7.3/RestClient/Request I have no idea how it may be accomplished.
I am also aware that I could use another library (see "Using WWW:Mechanize to download a file to disk without loading it all in memory first"), but my program is already using rest-client.
Simplified code:
data = RestClient::Request.execute(:method => :get, :url => url, :timeout => 3600)
file = File.new(filename, 'w')
file.write data
file.close
Code - https://github.com/mkoniecz/CartoCSSHelper/blob/395deab626209bcdafd675c2d8e08d0e3dd0c7f9/downloader.rb#L126
Another way is to use raw_response. This saves directly to a file, usually in /tmp, and handles redirects without a problem.
See Streaming Responses. Here's their example:
>> raw = RestClient::Request.execute(
method: :get,
url: 'http://releases.ubuntu.com/16.04.2/ubuntu-16.04.2-desktop-amd64.iso',
raw_response: true)
=> <RestClient::RawResponse @code=200, @file=#<Tempfile:/tmp/rest-client.20170522-5346-1pptjm1>, @request=<RestClient::Request @method="get", @url="http://releases.ubuntu.com/16.04.2/ubuntu-16.04.2-desktop-amd64.iso">>
>> raw.file.size
=> 1554186240
>> raw.file.path
=> "/tmp/rest-client.20170522-5346-1pptjm1"
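Since raw.file is a Tempfile, you will probably want to move it to its final location once the download finishes. Here is a minimal sketch of that step; persist_raw_download is a hypothetical helper name, not part of rest-client, and it only relies on the response exposing the Tempfile through #file as in the transcript above.

```ruby
require 'fileutils'

# Hypothetical helper (not part of rest-client): moves the Tempfile behind a
# raw response to its final destination and returns that destination path.
def persist_raw_download(raw, destination)
  source = raw.file.path
  # Close the handle first so buffered data is flushed and the file can be
  # moved on platforms that refuse to move open files.
  raw.file.close
  # A rename when source and destination share a filesystem; otherwise
  # FileUtils.mv falls back to copy-and-delete.
  FileUtils.mv(source, destination)
  destination
end

# Usage, assuming `raw` was obtained with raw_response: true as shown above:
#   persist_raw_download(raw, filename)
```

If /tmp and the destination are on different filesystems, the move becomes a full copy, which takes a noticeable amount of time for a 1.5 GB file.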
My original answer promoted passing a block to RestClient::Request#execute, but this only passes data to the block once the full response has been read, which makes the exercise pointless. This is how to do it properly:
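# Open in binary mode so the bytes are written unchanged on every platform.
File.open('/tmp/foo.iso', 'wb') {|f|
  block = proc { |response|
    # read_body yields the body in chunks as they arrive from the socket.
    response.read_body do |chunk|
      puts "Working on response"
      f.write chunk
    end
  }
  RestClient::Request.new(method: :get, url: 'http://mirror.pnl.gov/releases/xenial/ubuntu-16.04-server-amd64.iso', block_response: block).execute
}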
It is from the related rest-client project issue.
Note: redirects are not followed in this mode, and you also lose rest-client's usual handling of the HTTP status, cookies, headers, etc. Hopefully this will be fixed some day.
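The status check can at least be done by hand, because the proc given as block_response receives the raw Net::HTTPResponse object. Below is a rough sketch of that idea; streaming_block is a hypothetical helper name, and raising a plain RuntimeError on a non-2xx status is just one possible choice.

```ruby
# Hypothetical helper (not part of rest-client): builds a proc for the
# :block_response option that checks the status before streaming the body.
def streaming_block(io)
  proc do |response|
    # Net::HTTPResponse#code is a String such as "200".
    status = response.code.to_i
    unless (200..299).cover?(status)
      raise "Download failed with HTTP status #{status}"
    end
    # Write each chunk as soon as it arrives instead of buffering the body.
    response.read_body { |chunk| io.write(chunk) }
  end
end

# Usage with the names from the question (requires the rest-client gem):
#   File.open(filename, 'wb') do |f|
#     RestClient::Request.execute(method: :get, url: url, timeout: 3600,
#                                 block_response: streaming_block(f))
#   end
```

A 3xx response also raises here, since redirects are not followed in this mode. The output file has already been opened at that point, so you may want to delete it when the error is raised.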