I have a large CSV file on a server I'd like to download and process in chunks, without reading the whole thing into memory. After a bit of finagling I've come up with this:
require 'open-uri'

open("http://example.com/#{LARGE_CSV_FILE}") do |file|
  # Import the CSV in batches of 50,000 lines rather than all at once.
  file.each_slice(50_000) do |fifty_thousand_lines|
    MyModel.import fifty_thousand_lines.join
  end
end
My understanding is that open-uri's #open will wrap the HTTP GET and return an IO-like enumerable object. #each_slice(n) will pass the block an array of n lines at a time. I then join and process those lines.
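(As a quick illustration of that grouping behaviour, here is a minimal sketch with StringIO standing in for the downloaded file; the slice size of 2 is only for demonstration:)

require 'stringio'

# StringIO stands in for the IO-like object open-uri returns;
# each_slice(2) yields arrays of at most 2 lines per iteration.
io = StringIO.new("a\nb\nc\nd\ne\n")
io.each_slice(2) { |lines| p lines }
# => ["a\n", "b\n"]
#    ["c\n", "d\n"]
#    ["e\n"]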
This imports just fine, and judging by my OS X iStat menu, the memory usage of the ruby process doesn't get out of hand. However, it looks like the whole file was downloaded at once. How can that be without the memory usage exploding?
Does ruby download it to a temporary file and then read it from disk line by line? I would have thought open-uri would instead throttle the HTTP connection and only download more data when its block has finished processing its batch of data.
Is this the right way of downloading and processing a file without loading it all into memory?
Yes, it does download to a tempfile. This is easily observed from the console:
2.0.0-p247 :001 > require 'open-uri'
=> true
2.0.0-p247 :002 > f = open("http://stackoverflow.com/questions/19279715/does-ruby-open-uri-http-streaming-throttle-the-download-or-save-to-a-temp-file")
=> #<Tempfile:/tmp/open-uri20140220-27172-1kcjwk2>
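open-uri buffers small responses in memory and, once the body grows beyond OpenURI::Buffer::StringMax (10 KB by default, if I recall correctly), spills it to a Tempfile, so the whole file is on disk before your block ever runs. If you want to process the response as it arrives instead, a sketch along these lines using Net::HTTP streaming should work; note that the URL and MyModel.import are carried over from the question, and the line-buffering logic is just one assumed way of batching the data:

require 'net/http'
require 'uri'

uri = URI("http://example.com/#{LARGE_CSV_FILE}")

Net::HTTP.start(uri.host, uri.port) do |http|
  buffer = ""
  http.request(Net::HTTP::Get.new(uri)) do |response|
    # read_body with a block yields the body in chunks as it arrives,
    # so only one chunk (plus any partial trailing line) is held in memory.
    response.read_body do |chunk|
      buffer << chunk
      lines = buffer.split("\n", -1)
      buffer = lines.pop || ""   # keep the partial trailing line for the next chunk
      MyModel.import(lines.map { |l| l + "\n" }.join) unless lines.empty?
    end
  end
  MyModel.import(buffer) unless buffer.empty?   # any final line without a newline
end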