
Encoding::UndefinedConversionError when writing a binary file

I have a site that needs to encrypt and store binary files that are uploaded to the server. Uploading and storage work fine, but I get this error when writing the encrypted file:

Encoding::UndefinedConversionError ("\xDD" from ASCII-8BIT to UTF-8):

The code that causes it looks like this:

fd_in = IO.sysopen(self[:name].tempfile.path, "rb")
file_in = IO.open(fd_in)
fd_out = IO.sysopen(self[:name].tempfile.path + ".encrypted", "wb")
file_out = IO.open(fd_out)
cipher = OpenSSL::Cipher::Cipher.new('aes-256-cbc')
cipher.encrypt
cipher.key = cipher_key
cipher.iv = cipher_iv
while chunk = file_in.read(1024)
  file_out << cipher.update(chunk)
end
file_out << cipher.final

The line that raises the error is file_out << cipher.update(chunk) in the while loop. I've looked into this online and found some reports of similar ASCII/UTF conversion issues, but they all appear to involve coercing string input, not streamed file input. I'm using Ruby 1.9.2, which I believe affects the default string encoding.

My rationale as to why (I think) I need to use a stream-based approach: the files tend to be large and I do not want to load the entire file (input or output) into memory to process it.

Any help is appreciated. Thanks.

Chris Hart asked Jul 25 '11 02:07


1 Answer

When encrypting or decrypting, you want to treat input and output as raw bytes and avoid, at all costs, any transcoding caused by associating an encoding with your data. So you should open your files in binary mode, both for reading and for writing.
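To illustrate the point about raw bytes, here is a minimal sketch, assuming a reasonably recent Ruby (String#b and Tempfile.create with a block are not available in 1.9.2). The sample bytes and the helper name read_binary_sample are made up for the illustration. It shows that data read in "rb" mode carries the binary (ASCII-8BIT) encoding, and that asking Ruby to transcode such data to UTF-8 raises the same kind of error as in the question:

```ruby
require 'tempfile'

# Hypothetical sample data: 0xDD has no mapping from binary to UTF-8.
RAW_BYTES = "\xDD\x00\xFF".b

# Writes the sample bytes to a temp file and reads them back in binary mode.
def read_binary_sample
  Tempfile.create('sample') do |tmp|
    tmp.binmode
    tmp.write(RAW_BYTES)
    tmp.flush
    File.open(tmp.path, 'rb') { |f| f.read }
  end
end

data = read_binary_sample
puts data.encoding # the binary (ASCII-8BIT) encoding

begin
  # Transcoding raw bytes is what we want to avoid; it fails on 0xDD.
  data.encode('UTF-8')
rescue Encoding::UndefinedConversionError => e
  puts "transcoding failed: #{e.message}"
end
```

As long as both ends stay in binary mode, Ruby has no reason to attempt that conversion.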

You actually did this with IO.sysopen, but then you did not pass the "b" flag when calling IO.open, so the IO objects wrapping those descriptors are not in binary mode.
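If you would rather keep the descriptor-based approach, a possible sketch looks like the following. IO.open accepts a mode string as its second argument, so the "b" flag can be repeated there. The output path is a throwaway temp location chosen for this example, and String#b assumes Ruby 2.0 or later:

```ruby
require 'tmpdir'

# Hypothetical output path, used only for this illustration.
path = File.join(Dir.mktmpdir, 'out.bin')

fd = IO.sysopen(path, 'wb')
out = IO.open(fd, 'wb')        # repeat the mode, including the "b" flag
binmode_flag = out.binmode?    # reports whether the stream is in binary mode
out << "\xDD\xEE".b            # raw bytes, written without transcoding
out.close

written = File.binread(path)
puts written.bytes.inspect
```

The same applies to the input side: IO.open(fd_in, "rb").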

Your code should work if you try this instead:

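require 'openssl'

# Open both files in binary mode so no encoding is attached to the data.
fin = File.open("TODO", "rb")
fout = File.open("TODO.encrypted", "wb")
cipher = OpenSSL::Cipher.new('aes-256-cbc')  # OpenSSL::Cipher::Cipher is a deprecated alias
cipher.encrypt
cipher.key = key  # key and iv are assumed to be defined elsewhere
cipher.iv = iv
# Encrypt in 1 KB chunks so the whole file is never held in memory.
while chunk = fin.read(1024)
  fout << cipher.update(chunk)
end
fout << cipher.final  # write the final padded block
fin.close
fout.close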
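
For completeness, here is a rough round-trip sketch of the same streaming idea, using the block form of File.open so the files are closed even if an exception is raised. The helper name stream_cipher, the temp file names, and the randomly generated key, IV, and test data are all assumptions made for this example, not part of the original code:

```ruby
require 'openssl'
require 'tmpdir'

CHUNK_SIZE = 1024

# Streams src_path through an already configured cipher into dst_path,
# one chunk at a time, with both files opened in binary mode.
def stream_cipher(cipher, src_path, dst_path)
  File.open(src_path, 'rb') do |fin|
    File.open(dst_path, 'wb') do |fout|
      while (chunk = fin.read(CHUNK_SIZE))
        fout << cipher.update(chunk)
      end
      fout << cipher.final
    end
  end
end

# Hypothetical test files in a throwaway directory.
dir = Dir.mktmpdir
plain_path = File.join(dir, 'plain.bin')
enc_path = File.join(dir, 'plain.bin.encrypted')
dec_path = File.join(dir, 'plain.bin.decrypted')

original = OpenSSL::Random.random_bytes(5000)
File.binwrite(plain_path, original)

# Encrypt with a random key and IV generated just for this demo.
enc = OpenSSL::Cipher.new('aes-256-cbc')
enc.encrypt
key = enc.random_key
iv = enc.random_iv
stream_cipher(enc, plain_path, enc_path)

# Decrypt with the same key and IV and compare with the original bytes.
dec = OpenSSL::Cipher.new('aes-256-cbc')
dec.decrypt
dec.key = key
dec.iv = iv
stream_cipher(dec, enc_path, dec_path)

roundtrip_ok = File.binread(dec_path) == original
puts "round trip ok: #{roundtrip_ok}"
```

Decryption goes through the same loop; only the cipher setup differs.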

emboss answered Oct 03 '22 16:10