How do I get Zlib to uncompress from S3 stream in Ruby?

Ruby's Zlib::GzipReader must be created with an IO-like object, i.e. one with a read method that behaves the same as IO#read.
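For illustration, any object that responds to read this way will do — here a StringIO holding gzipped bytes stands in for the downloaded file (a self-contained sketch, no AWS involved):

```ruby
require 'zlib'
require 'stringio'

# Gzip some sample data into an in-memory buffer ...
buffer = StringIO.new
gz = Zlib::GzipWriter.new(buffer)
gz.write "hello world"
gz.close

# ... then read it back: GzipReader only cares about the object's #read
io = StringIO.new(buffer.string)
puts Zlib::GzipReader.new(io).read  # => hello world
```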

My problem is that I can't get such an IO-like object from the AWS::S3 lib. As far as I know, the only way to get a stream from it is to pass a block to S3Object#stream.

I already tried:

Zlib::GzipReader.new(AWS::S3::S3Object.stream('file', 'bucket'))
# Which raises: undefined method `read' for #<AWS::S3::S3Object::Value:0x000000017cbe78>

Does anybody know how I can achieve this?

asked Jun 03 '14 by gfpacheco

1 Answer

A simple solution is to write the downloaded data to a StringIO, then read it back out:

require 'zlib'
require 'stringio'

# Download the whole object into memory, wrap it in a StringIO ...
io = StringIO.new
io.write AWS::S3::S3Object.value('file', 'bucket')
io.rewind

# ... and let GzipReader treat it as an IO
gz = Zlib::GzipReader.new(io)
data = gz.read
gz.close

# do something with data ...

A more elaborate way is to start inflating the gzipped data while the stream is still downloading, which can be achieved with IO.pipe. Something along these lines:

require 'zlib'

reader, writer = IO.pipe

fork do
  reader.close  # the child only writes
  AWS::S3::S3Object.stream('file', 'bucket') do |chunk|
    writer.write chunk
  end
end

writer.close  # the parent only reads; without this, gets would block forever

gz = Zlib::GzipReader.new(reader)
while line = gz.gets
  # do something with line ...
end

gz.close
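To see the pipe mechanics in isolation, the same pattern works with a child process that gzips data into the writer end — a hypothetical stand-in for the S3 stream (note that fork is only available on Unix-like platforms):

```ruby
require 'zlib'

reader, writer = IO.pipe

fork do
  reader.close  # the child only writes
  gz = Zlib::GzipWriter.new(writer)
  gz.write "line one\nline two\n"
  gz.close  # finalizes the gzip stream and closes the writer end
end

writer.close  # parent must close its copy, or the reader never sees EOF

gz = Zlib::GzipReader.new(reader)
lines = gz.readlines
gz.close
Process.wait

lines  # => ["line one\n", "line two\n"]
```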

You can also use a Thread instead of fork:

require 'zlib'

reader, writer = IO.pipe

thread = Thread.new do
  AWS::S3::S3Object.stream('file', 'bucket') do |chunk|
    writer.write chunk
  end
  writer.close  # signal EOF to the reading side
end

gz = Zlib::GzipReader.new(reader)
while line = gz.gets
  # do something with line ...
end

gz.close
thread.join
answered Nov 15 '22 by Patrick Oscity