
Parse CSV file with header fields as attributes for each row

Tags: parsing, ruby, csv

I would like to parse a CSV file so that each row is treated like an object, with the header row supplying the names of the object's attributes. I could write this myself, but I'm sure it's already out there.

Here is my CSV input:

    "foo","bar","baz"
    1,2,3
    "blah",7,"blam"
    4,5,6

The code would look something like this:

    CSV.open('my_file.csv','r') do |csv_obj|
      puts csv_obj.foo   #prints 1 the 1st time, "blah" 2nd time, etc
      puts csv_obj.bar   #prints 2 the first time, 7 the 2nd time, etc
    end

With Ruby's CSV module I believe I can only access the fields by index. I think the above code would be a bit more readable. Any ideas?

Poul, asked Sep 15 '10 12:09




2 Answers

Using Ruby 1.9 and above, you can get an object that is indexable by header name:

    require 'csv'

    # :headers => true makes each row addressable by header name
    CSV.foreach('my_file.csv', :headers => true) do |row|
      puts row['foo'] # prints 1 the 1st time, "blah" 2nd time, etc
      puts row['bar'] # prints 2 the first time, 7 the 2nd time, etc
    end

It's not dot syntax but it is much nicer to work with than numeric indexes.
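If dot syntax really is wanted, one possible approach is to build a Struct from the headers. The following is only a sketch, not something the CSV library does for you: it assumes every header converts to a valid Ruby method name, it inlines the question's data so that it runs without a file, and the names csv_text, record_class and records are made up for illustration.

```ruby
require 'csv'

# Sample data copied from the question, inlined so no file is needed.
csv_text = <<~DATA
  "foo","bar","baz"
  1,2,3
  "blah",7,"blam"
  4,5,6
DATA

# Parse everything at once; header names become symbols (:foo, :bar, :baz).
table = CSV.parse(csv_text, :headers => true, :header_converters => :symbol)

# Hypothetical helper class: one Struct member per header.
record_class = Struct.new(*table.headers)

# One struct per data row. Fields stay strings because no value converters are set.
records = table.map { |row| record_class.new(*row.fields) }

records.each do |record|
  puts record.foo
  puts record.bar
end
```

This reads the whole file into memory, so for large files the row['foo'] form with CSV.foreach is probably the better fit.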

As an aside, for Ruby 1.8.x FasterCSV is what you need to use the above syntax.

Peer Allan, answered Oct 19 '22 20:10


Here is an example of the symbolic syntax using Ruby 1.9. In the examples below, the code reads a CSV file named data.csv from the Rails db directory.

:headers => true treats the first row as a header instead of a data row. The :header_converters => :symbol option then converts each cell in the header row into a Ruby symbol.

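    require 'csv'

    # Options are passed without braces so that newer Rubies also treat them as keyword arguments.
    CSV.foreach("#{Rails.root}/db/data.csv", :headers => true, :header_converters => :symbol) do |row|
      puts "#{row[:foo]},#{row[:bar]},#{row[:baz]}"
    end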
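To make the effect of :headers => true concrete, here is a small sketch that parses the same text with and without the option. It uses CSV.parse on an inline string instead of a file so it can be run anywhere; the variable names are illustrative only.

```ruby
require 'csv'

csv_text = "foo,bar,baz\n1,2,3\n"

# Without :headers, the header line is returned as an ordinary first row.
plain = CSV.parse(csv_text)
p plain.first

# With :headers => true, the first line names the fields of every later row.
table = CSV.parse(csv_text, :headers => true, :header_converters => :symbol)
p table.first[:foo]
```

In the first case the result has two rows and fields are reachable only by position. In the second there is a single data row, keyed by :foo, :bar and :baz.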

In Ruby 1.8:

    require 'fastercsv'

    # On Ruby 1.8 the gem's class is named FasterCSV; the bundled CSV class has a different API.
    FasterCSV.foreach("#{Rails.root}/db/data.csv", :headers => true, :header_converters => :symbol) do |row|
      puts "#{row[:foo]},#{row[:bar]},#{row[:baz]}"
    end

Based on the CSV provided by Poul (the asker), the output from the example code above will be:

    1,2,3
    blah,7,blam
    4,5,6

Depending on the characters used in the headers of the CSV file, it may be necessary to output the headers in order to see how CSV (FasterCSV) converted the string headers to symbols. You can get the array of headers from within the CSV.foreach block:

    row.headers
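As a rough illustration of why checking the headers can matter, the sketch below feeds the :symbol converter two headers containing a space and a hyphen. The sample headers and data are invented for this example, and the exact conversion rules may differ slightly between versions of the CSV library, so it is worth printing the result on your own setup.

```ruby
require 'csv'

# Hypothetical headers with a space and punctuation, to show what :symbol does to them.
messy_csv = "First Name,Zip-Code\nAda,12345\n"

table = CSV.parse(messy_csv, :headers => true, :header_converters => :symbol)

# Inspect the converted headers before relying on them as keys.
p table.headers

table.each do |row|
  puts row[:first_name]
end
```

Here the space becomes an underscore and the header is downcased, so "First Name" is reached as row[:first_name] and not row[:"First Name"].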
scarver2, answered Oct 19 '22 20:10