Question

I have a string of binary data and I need it as an IO object. So I tried this:

r, w = IO.pipe()
w << data

But it fails with this error:

Encoding::UndefinedConversionError ("\xD0" from ASCII-8BIT to UTF-8)

Why is it trying to convert to UTF-8 in the first place? Is there a way to force IO.pipe into binary mode?

More details:

I'm trying to read binary data (an Excel file) from MongoDB using Mongoid, and then wrap it in an IO object so that the Spreadsheet gem can read it. Spreadsheet.open expects either a file path or an IO object.

Here's how my file document looks:

class ImportedFile
  include Mongoid::Document

  field :file_name, type: String
  field :binary_content, type: Moped::BSON::Binary
end

Here's how I saved the binary data in the first place:

imported_file = ImportedFile.new
imported_file.file_name = uploaded_file.original_filename
imported_file.binary_content = Moped::BSON::Binary.new(:generic, uploaded_file.read)
imported_file.save

And here's how I'm trying to read it (doesn't work):

imported_file = ImportedFile.find(file_id)

r, w = IO.pipe()
w << imported_file.binary_content.data
book = Spreadsheet.open r

Solution

You can use a StringIO for this. It wraps the string in an in-memory object that behaves like an IO:

require 'stringio'

io = StringIO.new(binary_data)
book = Spreadsheet.open(io)
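
As a minimal sketch of what that wrapper gives you: the bytes below are only a stand-in for imported_file.binary_content.data (an assumption for illustration), and the Spreadsheet call is left out so the snippet runs on its own.

```ruby
require 'stringio'

# Stand-in for the real file contents; .b returns a copy tagged ASCII-8BIT.
binary_data = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b

# Wrapping an existing string keeps that string's encoding.
io = StringIO.new(binary_data)
encoding = io.string.encoding

# Reading with an explicit length returns raw bytes.
first = io.read(4)

# Unlike a pipe, a StringIO can be rewound and re-read.
io.rewind
```

Because the StringIO is built from a string that is already binary, no conversion to UTF-8 is attempted.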

OTHER TIPS

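Do not use a raw StringIO for binary data. It looks like nobody has tested StringIO in the real world. The script below puts the same three bytes into a StringIO in several ways and prints the resulting encoding and first character; in some cases the buffer ends up tagged as UTF-8: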

require 'stringio'

# Baseline: pack("H*") returns an ASCII-8BIT (binary) string.
bin = ["d9a1a2"].pack("H*")
puts bin.encoding
puts bin[0].unpack("H*")
puts "----"

# Wrap the existing binary string.
io = StringIO.new bin
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

# Write into an empty StringIO, then force the encoding afterwards.
io = StringIO.new
io << bin
puts io.string.encoding
puts io.string[0].unpack("H*")
io.string.force_encoding Encoding::BINARY
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

# Call binmode before writing, then force the encoding afterwards.
io = StringIO.new
io.binmode
io << bin
puts io.string.encoding
puts io.string[0].unpack("H*")
io.string.force_encoding Encoding::BINARY
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

# Call set_encoding before writing.
io = StringIO.new
io.set_encoding Encoding::BINARY
io << bin
puts io.string.encoding
puts io.string[0].unpack("H*")
puts "----"

ruby-2.3.3

ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
ASCII-8BIT
d9
ASCII-8BIT
d9
----
ASCII-8BIT
d9
----

rbx-3.72

ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
ASCII-8BIT
d9
----

jruby-9.1.7.0

ASCII-8BIT
d9
----
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
UTF-8
d9a1
ASCII-8BIT
d9
----
ASCII-8BIT
d9
----

Conclusions:

  1. Do not use a raw StringIO, ever.
  2. Do not trust binmode. Only on MRI does it actually change the encoding; on Rubinius and JRuby it is a stub.
  3. Use io.set_encoding Encoding::BINARY or io.string.force_encoding Encoding::BINARY.
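
The third point can be wrapped in a small helper. This is only a sketch: binary_string_io is a hypothetical name, not part of StringIO, and the sample bytes are the same ones used in the script above.

```ruby
require 'stringio'

# Hypothetical helper: an empty, writable StringIO whose buffer is
# tagged as binary before anything is written to it.
def binary_string_io
  io = StringIO.new
  io.set_encoding(Encoding::BINARY)
  io
end

bin = ["d9a1a2"].pack("H*")   # three sample bytes, ASCII-8BIT

io = binary_string_io
io << bin
io.rewind                     # back to the start, ready for a reader

# With a binary buffer, indexing returns single bytes, not UTF-8 characters.
first_byte = io.string[0].unpack("H*").first
```

Setting the encoding before the first write means no later force_encoding call is needed.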
Licensed under: CC-BY-SA with attribution