
Ruby: How can I read a CSV file that contains two headers in Ruby?

I have a ".CSV" file that I'm trying to parse using CSV in Ruby. The file has two rows of headers, though; I've never encountered this before and don't know how to handle it. Below is an example of the headers and rows.

Row 1

"Institution ID","Institution","Game Date","Uniform Number","Last Name","First Name","Rushing","","","","","Passing","","","","","","Total Off.","","Receiving","","","Pass Int","","","Fumble Ret","","","Punting","","Punt Ret","","","KO Ret","","","Total TD","Off xpts","","","","Def xpts","","","","FG","","Saf","Points"

Row 2

"","","","","","","Rushes","Gain","Loss","Net","TD","Att","Cmp","Int","Yards","TD","Conv","Plays","Yards","No.","Yards","TD","No.","Yards","TD","No.","Yards","TD","No.","Yards","No.","Yards","TD","No.","Yards","TD","","Kicks Att","Kicks Made","R/P Att","R/P Made","Kicks Att","Kicks Made","Int/Fum Att","Int/Fum Made","Att","Made"

Row 3

"721","AirForce","09/01/12","19","BASKA","DAVID","","","","","","","","","","","","0","0","","","","","","","","","","2","85","","","","","","","","","","","","","","","","","","","0"

There are no returns in the example above; I just added them so it would be easier to read. Does CSV have methods available to handle this structure, or will I have to write my own? Thanks!

daveomcd asked Jun 06 '13 00:06


2 Answers

It looks like your CSV file was produced from an Excel spreadsheet that has columns grouped like this:

... |        Rushing        |         Passing         | ...
... |Rushes|Gain|Loss|Net|TD|Att|Cmp|Int|Yards|TD|Conv| ...

(Not sure if I restored the groups properly.)

There are no standard tools for working with this kind of CSV file, AFAIK. You have to do the job manually.

  • Read the first line, treat it as first header line.
  • Read the second line, treat it as second header line.
  • Read the third line, treat it as first data line.
  • ...
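The steps above can be sketched with Ruby's standard CSV library. This is only a minimal sketch: combine_headers and read_two_header_csv are hypothetical helper names, and joining the group name with the column name (e.g. "Rushing Gain") is an assumption about how you might want to label the columns, not something the file format dictates.

```ruby
require 'csv'

# Hypothetical helper: merges the group row and the column row into one
# header per column, e.g. "Rushing" + "Gain" => "Rushing Gain".
def combine_headers(groups, columns)
  current_group = nil
  width = [groups.size, columns.size].max
  Array.new(width) do |i|
    group  = groups[i].to_s.strip
    column = columns[i].to_s.strip
    if column.empty?
      # No second-row label: a standalone column, which also ends any open group.
      current_group = nil
      group
    else
      # A blank group cell inherits the last group name seen to its left.
      current_group = group unless group.empty?
      [current_group, column].compact.join(' ')
    end
  end
end

# Hypothetical helper: reads two header lines, then returns the data rows
# as an array of hashes keyed by the combined headers.
def read_two_header_csv(io)
  csv = CSV.new(io)
  groups  = csv.shift || []  # first header line
  columns = csv.shift || []  # second header line
  headers = combine_headers(groups, columns)
  csv.map { |row| headers.zip(row).to_h }  # remaining lines are data
end
```

You could call it as read_two_header_csv(File.open(path)), where path points at your file. Each row then comes back as a hash, so a value can be looked up with a key such as "Punting Yards".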
Sergey Bolgov answered Oct 07 '22 08:10


I'd recommend using the smarter_csv gem and manually providing the correct headers:

 require 'smarter_csv'
 options = {:user_provided_headers => ["Institution ID","Institution","Game Date","Uniform Number","Last Name","First Name", ... provide all headers here ... ],
            :headers_in_file => false}
 data = SmarterCSV.process(filename, options)
 data.shift # to ignore the first header line
 data.shift # to ignore the second header line
 # data now contains an array of hashes with your data

Please check the GitHub page for the options and examples: https://github.com/tilo/smarter_csv

The option to use is :user_provided_headers: specify the headers you want in an array. This way you can work around cases like this.

Because :headers_in_file is false, the two header lines in the file come back as the first two entries of data, so you have to call data.shift twice to drop them.
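For comparison, the same "supply your own headers" idea can be sketched with only the standard CSV library. This is not part of smarter_csv: read_with_own_headers is a hypothetical helper, and it assumes the file starts with exactly two header lines unless told otherwise.

```ruby
require 'csv'

# Hypothetical helper: applies caller-supplied headers to every line,
# then discards the header lines that are physically present in the file.
def read_with_own_headers(io, headers, header_lines: 2)
  # When :headers is an Array, CSV treats the first line of input as data,
  # so the file's own header lines show up as ordinary rows.
  CSV.new(io, headers: headers)
     .drop(header_lines)  # skip the file's header lines
     .map(&:to_h)         # convert each CSV::Row to a plain Hash
end
```

The result is an array of hashes keyed by the headers you passed in, which is the same shape of data the smarter_csv approach gives you.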

Tilo answered Oct 07 '22 07:10