How to pre-process CSV data for FasterCSV?

Question

We're having a significant number of problems creating a bulk upload function for our little app. We're using the FasterCSV gem to upload data to a MySQL database, but FasterCSV is so twitchy and precise in its requirements that it constantly breaks with malformed CSV errors and timeout errors.

The CSV files are generally created by users pasting text from their web sites or from Microsoft Word docs, so it is not reasonable to expect that there will never be odd characters like smart quotes or accents in the data. Also, users aren't going to be readily able to identify whether their data is perfect enough for FasterCSV. We need to find a way to fix it for them automatically.

Is there a good way or a reliable tool for pre-processing CSV data to fix any nits in the data before having the FasterCSV gem process it?

derfred · Accepted Answer

Try the CSV library in the standard lib. It is more forgiving about malformed CSV: http://ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
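Whichever parser you end up with, one way to keep a single bad row from failing the whole upload is to parse line by line and set aside the rows that raise. This is only a rough sketch using the standard library: `parse_tolerantly` is a made-up helper name, and it assumes no field contains an embedded newline (a quoted multi-line field would be split incorrectly).

```ruby
require "csv"

# Hypothetical helper: parses each line on its own and collects failures
# instead of aborting. Assumes one record per physical line.
def parse_tolerantly(text)
  good = []
  bad = []
  text.each_line.with_index(1) do |line, lineno|
    begin
      row = CSV.parse_line(line)
      # Blank lines come back as nil or an empty array, so skip them.
      good << row unless row.nil? || row.empty?
    rescue CSV::MalformedCSVError => e
      # Remember the line number and message so the user can be told what to fix.
      bad << [lineno, e.message]
    end
  end
  [good, bad]
end
```

The `bad` list can then be shown back to the user, or passed to a clean-up step and retried.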

Taryn East · Answer

You can pass the file's encoding into the FasterCSV options when creating a new instance of the FasterCSV parser (see the docs here: http://fastercsv.rubyforge.org/classes/FasterCSV.html#M000018).

Setting it to UTF-8 or the Microsoft encoding should get it past most dodgy extra characters, allowing it to actually parse into your required strings... then you can clean the strings to your heart's content.
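If you would rather fix the encoding yourself before handing the text to the parser, something like the following might work. It is a sketch against the standard library `CSV` on Ruby 1.9 or later, not FasterCSV's own encoding option. `normalize_csv_text` is a made-up name, and the fallback to Windows-1252 is a guess about where the bytes came from (text pasted from Word on Windows). Smart quotes are deliberately left alone here: turning them into straight `"` before parsing can itself produce an illegal-quoting error, so that clean-up is better done on the parsed strings.

```ruby
require "csv"

# Hypothetical helper: returns the upload as valid UTF-8 text.
# Assumes the bytes are either UTF-8 already or Windows-1252.
def normalize_csv_text(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  unless text.valid_encoding?
    # Reinterpret the bytes as Windows-1252 and transcode to UTF-8,
    # substituting "?" for bytes that have no mapping.
    text = raw.dup.force_encoding(Encoding::WINDOWS_1252)
              .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "?")
  end
  text.sub(/\A\uFEFF/, "") # drop a leading byte-order mark
      .gsub(/\r\n?/, "\n") # normalise Windows and old Mac line endings
end

# Example use (the path is a placeholder):
#   rows = CSV.parse(normalize_csv_text(File.binread("upload.csv")), headers: true)
```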

There's also something in the docs about "converters" that you can pass in. Though this is aimed more at converting, say, numeric or date types, you might be able to use it to gsub for the dodgy chars.
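A converter used that way could look roughly like this. It is written against the standard library `CSV` (Ruby 1.9 or later), so treat it as a sketch rather than the FasterCSV gem's exact API. The character table, `CLEAN_FIELD` and `parse_clean` are all made-up names, and the list of replacements is just a starting point.

```ruby
require "csv"

# Hypothetical table of "dodgy" characters and their plain replacements.
SMART_CHARS = {
  "\u2018" => "'", "\u2019" => "'", # curly single quotes
  "\u201C" => '"', "\u201D" => '"', # curly double quotes
  "\u2013" => "-", "\u2014" => "-", # en and em dashes
  "\u00A0" => " "                   # non-breaking space
}.freeze
SMART_CHARS_RE = Regexp.union(SMART_CHARS.keys)

# A converter is called with each parsed field. Anything that is not a
# String (for example nil) is passed through unchanged.
CLEAN_FIELD = lambda do |field|
  field.is_a?(String) ? field.gsub(SMART_CHARS_RE, SMART_CHARS).strip : field
end

def parse_clean(text)
  CSV.parse(text, headers: true, converters: [CLEAN_FIELD])
end
```

Because the converter runs after each field has been split out, replacing curly quotes with straight ones at this stage cannot break the parse.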

Tilo · Answer

Try the smarter_csv gem. You can pass a block to its process method and clean up the data before it is used.

https://github.com/tilo/smarter_csv

Tags:

csv

ruby-on-rails

ruby-on-rails-plugins

fastercsv

Katherine Chalmers
