
Ruby: Binary String to IO

I have a string of binary data and I need it as an IO object. So I tried this:

r, w = IO.pipe()
w << data

But it fails with this error:

Encoding::UndefinedConversionError ("\xD0" from ASCII-8BIT to UTF-8)

Why is it trying to convert to UTF-8 in the first place? Is there a way to force IO.pipe into binary mode?

More details:

I'm trying to read binary data (which is an Excel file) from MongoDB using the Mongoid driver, and then convert it to an IO object in order to use the Spreadsheet gem to read it. Spreadsheet#open expects either a file path or an IO object.

Here's how my file document looks:

class ImportedFile
    include Mongoid::Document

    field :file_name, type: String
    field :binary_content, type: Moped::BSON::Binary
end

Here's how I saved the binary data in the first place:

imported_file = ImportedFile.new
imported_file.file_name = uploaded_file.original_filename
imported_file.binary_content = Moped::BSON::Binary.new(:generic, uploaded_file.read)
imported_file.save

And here's how I'm trying to read it (doesn't work):

imported_file = ImportedFile.find(file_id)

r, w = IO.pipe()
w << imported_file.binary_content.data
book = Spreadsheet.open r
Rotem Harel asked Jul 13 '13 19:07


2 Answers

You can wrap the string in a StringIO, which Spreadsheet.open accepts as an IO object:

require 'stringio'

io = StringIO.new(binary_data)
book = Spreadsheet.open(io)
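
A minimal sketch of the same idea without the Spreadsheet gem, so it runs on its own. The bytes in binary_data are a hypothetical stand-in for the content loaded from the database; they start with \xD0, the byte named in the error message. The sketch only shows that a StringIO built directly from a binary string keeps that string's encoding and hands back raw bytes.

```ruby
require 'stringio'

# Hypothetical sample bytes standing in for the stored file content.
binary_data = ["d0cf11e0a1b11ae1"].pack("H*")

# Wrapping an existing binary string keeps its ASCII-8BIT encoding.
io = StringIO.new(binary_data)

# read(length) returns the requested number of raw bytes.
header = io.read(4)

# Rewind so the next consumer starts at the first byte again.
io.rewind

puts io.string.encoding
puts header.unpack("H*").first
```

After the rewind, io can be passed to any consumer that expects a readable IO positioned at the start of the data.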
georgebrock answered Oct 17 '22 13:10


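Do not use a raw StringIO for binary data. It seems nobody has tested StringIO against real-world binary input. The script below writes the same three bytes into a StringIO in several ways and prints the buffer's encoding and first character each time: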

require 'stringio'

bin = ["d9a1a2"].pack("H*")
puts bin.encoding
puts bin[0].unpack("H*")
puts "----"

io = StringIO.new bin
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

io = StringIO.new
io << bin
puts io.string.encoding
puts io.string[0].unpack("H*")
io.string.force_encoding Encoding::BINARY
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

io = StringIO.new
io.binmode
io << bin
puts io.string.encoding
puts io.string[0].unpack("H*")
io.string.force_encoding Encoding::BINARY
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

io = StringIO.new
io.set_encoding Encoding::BINARY
io << bin
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

ruby-2.3.3

ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
ASCII-8BIT
d9
ASCII-8BIT
d9
----
ASCII-8BIT
d9
----

rbx-3.72

ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
ASCII-8BIT
d9
----

jruby-9.1.7.0

ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
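  1. Do not write binary data into a raw StringIO (one created with StringIO.new and no encoding set). In the outputs above its buffer is tagged UTF-8, so io.string[0] returns two bytes (d9a1) instead of one.
  2. Do not trust binmode. In the outputs above it only takes effect on MRI (ruby-2.3.3); on rbx-3.72 and jruby-9.1.7.0 the buffer stays UTF-8.
  3. Use io.set_encoding Encoding::BINARY or io.string.force_encoding Encoding::BINARY.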
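
The third recommendation can be wrapped in a small helper. This is only a sketch: binary_string_io is a hypothetical name, not part of StringIO or the Spreadsheet gem, and it assumes set_encoding behaves as in the outputs above.

```ruby
require 'stringio'

# Hypothetical helper: returns a StringIO whose buffer is tagged as
# binary before anything is written to it.
def binary_string_io(data = nil)
  io = StringIO.new
  io.set_encoding(Encoding::BINARY)
  io << data if data
  io.rewind # position readers at the first byte
  io
end

bin = ["d9a1a2"].pack("H*")
io = binary_string_io(bin)

# With a binary buffer, indexing yields a single byte.
first_byte = io.string[0].unpack("H*").first

puts io.string.encoding
puts first_byte
```

Setting the encoding before the first write avoids depending on binmode, which the outputs above show is not honored on every implementation.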
puchu answered Oct 17 '22 13:10