Ruby unable to parse a CSV file: CSV::MalformedCSVError (Illegal quoting in line 1.)

Ubuntu 12.04 LTS

Ruby ruby 1.9.3dev (2011-09-23 revision 33323) [i686-linux]

Rails 3.2.9

The following is the content of the CSV file I received:

"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"

However, when I try to parse the CSV file I get an error:

  1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
   => {:col_sep=>",", :quote_char=>"\""}
  1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
  CSV::MalformedCSVError: Illegal quoting in line 1.
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
      from (irb):22
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

Then I tried simplifying the data, i.e.:

"name","age","email"
"jignesh","30","[email protected]"

However, I still get the same error:

  1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
  CSV::MalformedCSVError: Illegal quoting in line 1.
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
      from (irb):23
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

Again I tried simplifying the data like this:

name,age,email
jignesh,30,[email protected]

and it works. See the output below:

  1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
  name
  age
  email
  jignesh
  30
  [email protected]
   => nil

But the CSV files I will be receiving contain quoted data, so removing the quotes is not the solution I am looking for. I am unable to figure out what is causing the error CSV::MalformedCSVError: Illegal quoting in line 1. in my earlier examples.

I have verified that the CSV has no leading/trailing spaces by enabling "Show whitespace characters" and "Show Line Endings" in my text editor. I have also verified the encoding as follows:

  1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
   => #<Encoding:UTF-8>

Note: I tried CSV.read too, but got the same error.

Can anybody please help me solve this problem and understand what is going wrong?

=====================

I just found the following post: http://www.ruby-forum.com/topic/448070 and tried the following:

  file_data = file.read
  file_data.gsub!('"', "'")
  arr_of_arrs = CSV.parse(file_data)

  arr_of_arrs.each do |arr|
    Rails.logger.debug "=======#{arr}"
  end

and got the following output:

  =======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
  =======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]

This broke the parsing: with the double quotes replaced by single quotes, the comma inside the date ("Mar 1, 2013 ...") is treated as a column separator, because the default col_sep is a comma. I then tried the quote_char option like this:

  arr_of_arrs = CSV.parse(file_data, :quote_char => "'") 

but it ended up with the following error:

   CSV::MalformedCSVError (Illegal quoting in line 1.): 

Thanks, Jignesh

Jignesh Gohel asked May 27 '13 12:05

2 Answers

quote_chars = %w(" | ~ ^ & *)
begin
  @report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
  quote_chars.empty? ? raise : retry
end

It's not perfect, but it works most of the time.

N.B. CSV.parse takes the same parameters as CSV.read, so either a file or data from memory can be used.
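
For data already in memory, the same retry idea can be sketched with CSV.parse. This is only an illustration of the approach above, not a general fix: the helper name parse_with_fallback_quotes and the sample data are made up for the example, and the candidate list simply mirrors the one in the answer.

```ruby
require "csv"

# Hypothetical helper: tries each candidate quote character in turn and
# returns the first successful parse. Re-raises the last error if every
# candidate fails.
def parse_with_fallback_quotes(data, candidates = ['"', "|", "~", "^", "&", "*"])
  remaining = candidates.dup
  begin
    CSV.parse(data, headers: :first_row, quote_char: remaining.shift)
  rescue CSV::MalformedCSVError
    raise if remaining.empty?
    retry
  end
end

# Made-up sample: the stray double quote in an unquoted field is rejected
# when '"' is the quote character, so the helper falls back to "|".
table = parse_with_fallback_quotes(%(name,size\nmonitor,24" wide\n))
puts table[0]["size"]
```

Note that when a fallback character is used, any double quotes in the data are kept as literal characters in the parsed values, which is one reason this approach only works "most of the time".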

Vadym Tyemirov answered Sep 21 '22 09:09


Anand, thank you for the encoding suggestion. This solved the illegal quoting problem for me.

Note: If you want the iterator to skip over the header row, add headers: :first_row, like so:

CSV.foreach("test.csv", encoding: "bom|utf-8", headers: :first_row) 
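
The "\xEF\xBB\xBF" bytes in front of 'date/time' in the question's debug output are a UTF-8 byte order mark (BOM), which a text editor's whitespace view typically does not show. A minimal sketch of checking for the BOM and reading the file with it stripped follows; the helper names and the temporary sample file are assumptions made for the example, not part of the original posts.

```ruby
require "csv"
require "tempfile"

# UTF-8 byte order mark as raw bytes.
BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: true if the file begins with a UTF-8 BOM.
def starts_with_bom?(path)
  File.binread(path, 3) == BOM
end

# Hypothetical helper: reads the file with the BOM stripped and the first
# row used as headers, returning each data row as a Hash.
def read_rows_with_bom_handling(path)
  rows = []
  CSV.foreach(path, encoding: "bom|utf-8", headers: :first_row) do |row|
    rows << row.to_h
  end
  rows
end

# Hypothetical sample file: a BOM followed by quoted CSV data.
def write_sample_with_bom
  file = Tempfile.new(["bom_sample", ".csv"])
  file.binmode
  file.write(BOM + %("name","age"\n"jignesh","30"\n).b)
  file.close
  file
end

sample = write_sample_with_bom
puts starts_with_bom?(sample.path)
puts read_rows_with_bom_handling(sample.path).inspect
```

With the "bom|utf-8" encoding the first header comes back as plain name rather than a string prefixed with the BOM bytes.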
theUtherSide answered Sep 19 '22 09:09