
Ruby CSV does not understand \r\n as row end

Tags: ruby, csv

I use an iPhone app that periodically emails me a log in CSV format. I have a Ruby script that sums the data in that log with older logs. Recently the app developer released an update that, for some unknown reason, added a carriage return to the end of each line, causing my script to fail. According to the docs, the row separator option (:row_sep) defaults to :auto, which should accept either \r\n or \n (in 1.9.2). I've tried Ruby 1.8.7, Ruby 1.9.2, and FasterCSV with 1.8.7. I get different error messages depending on which I use, including

  • CSV::IllegalFormatError
  • Unquoted fields do not allow \r or \n (line 1) (FasterCSV::MalformedCSVError)
  • can't dup NilClass (TypeError)

in 1.9.2. (The \r is not in a field, it's the end of the line!) The data formerly looked like this:

03-12-2012 07:59,120.0,
03-11-2012 08:27,120.0,
03-10-2012 07:57,120.0,

Now it looks like this:

03-12-2012 07:59,120.0,^M
03-11-2012 08:27,120.0,^M
03-10-2012 07:57,120.0,^M

Suspecting that CSV might be treating the ^M as part of the last field, I tried adding another comma:

03-12-2012 07:59,120.0,,^M

to no avail.

The only explanation I can think of is that CSV requires all fields to be in double quotes. I can think of various workarounds, such as reading the file in first, chomping the line ends, and then processing the array with CSV, but first I want to find out what I'm doing wrong. It seems like it should work.

By the way, my code is simply:

CSV.foreach(File.join($import_dir, file)) do |record|
  # ... process record ...
end

and I've tried setting :row_end => "\r\n" to no avail.
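
For comparison, the option documented for CSV is spelled :row_sep rather than :row_end. The following is a minimal sketch of passing it explicitly, assuming a current csv library and using an inline string as a stand-in for the emailed log; it is not necessarily a fix for the failure described here.

```ruby
require 'csv'

# Sample data standing in for the emailed log (an assumption for this
# sketch); every row ends in "\r\n" and has a trailing comma.
data = "03-12-2012 07:59,120.0,\r\n03-11-2012 08:27,120.0,\r\n"

# Name the row separator explicitly instead of relying on :auto.
rows = CSV.parse(data, row_sep: "\r\n")

# The trailing comma produces an empty (nil) last field in each record.
rows.each { |record| p record }
```

The same option can be passed to CSV.foreach when reading from a file.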

I'm on Mac OS X 10.6.8.

chetstone asked May 18 '12 20:05

1 Answer

Because CSV reads ahead in the data to detect the row separator when row_sep is :auto, I needed to do the following to prevent formatting and encoding exceptions:

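  • Decode the file through File.read
  • Remove those pesky carriage returns (could be one or more)
  • Parse the cleansed file as CSV

# Read the file as ISO-8859-1 and transcode it to UTF-8.
file = File.read(temp_file.path, encoding: 'ISO-8859-1:UTF-8')
# Drop every carriage return so that only "\n" line endings remain.
file = file.delete("\r")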

CSV.parse(file, headers: true) do |row|
  # do all the things
end
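
The steps above can also be tried without a file on disk. This is a minimal, self-contained sketch; the inline sample data and the "date" and "value" header names are hypothetical, chosen only to illustrate rows that end in one or more carriage returns.

```ruby
require 'csv'

# Hypothetical sample: a header row plus data rows whose line endings
# contain zero, one, or two carriage returns.
raw = "date,value\r\r\n03-12-2012 07:59,120.0\r\n03-11-2012 08:27,120.0\n"

# Remove every carriage return so that only "\n" line endings remain.
cleaned = raw.delete("\r")

# Parse the cleaned text; with headers: true each row is a CSV::Row.
total = 0.0
CSV.parse(cleaned, headers: true) do |row|
  total += row['value'].to_f
end

puts total
```

Deleting every "\r" would also strip carriage returns that appear inside quoted fields, so this approach suits data like the log above, where fields never contain line breaks.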

Note: I am using Ruby 2.1.3 in a Rails 4 application.

rxgx answered Sep 19 '22 07:09