
How can I skip the header row when reading a CSV in Ruby? [duplicate]

Tags: ruby, csv

Ruby's CSV class makes it pretty easy to iterate over each row:

CSV.foreach(file) { |row| puts row }

However, this always includes the header row, so I'll get as output:

header1, header2
foo, bar
baz, yak

I don't want the headers though. Now, when I call …

CSV.foreach(file, :headers => true)

I get this result:

#<CSV::Row:0x10112e510
    @header_row = false,
    attr_reader :row = [
        [0] [
            [0] "header1",
            [1] "foo"
        ],
        [1] [
            [0] "header2",
            [1] "bar"
        ]
    ]
>

That is expected, because the documentation says:

This setting causes #shift to return rows as CSV::Row objects instead of Arrays

But how can I skip the header row and still get each row back as a simple array? I don't want the more complicated CSV::Row object to be returned.

I definitely don't want to do this:

first = true
CSV.foreach(file) do |row|
  if first
    puts row
    first = false
  else
    # code for other rows
  end
end

slhck asked Jul 31 '12 12:07




3 Answers

Look at CSV#shift in the CSV class documentation:

The primary read method for wrapped Strings and IOs, a single row is pulled from the data source, parsed and returned as an Array of fields (if header rows are not used)

An example:

require 'csv'

# CSV FILE
# name, surname, location
# Mark, Needham, Sydney
# David, Smith, London

def parse_csv_file_for_names(path_to_csv)
  csv_contents = CSV.read(path_to_csv) # Array of Arrays, header row included
  csv_contents.shift                   # discard the header row
  csv_contents.map { |row| row[0] }    # first column of each data row
end
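
CSV.read loads the whole file into memory before the header is shifted off. For larger files, a streaming variant may be preferable. The following is a minimal sketch, assuming a small sample file that is created here only for illustration (the file contents and the data_rows variable are hypothetical). It lets :headers => true consume the header row and then calls CSV::Row#fields to get each row back as a plain Array:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample data, written to a temp file so the sketch is self-contained.
sample = Tempfile.new(['people', '.csv'])
sample.write("name,surname,location\nMark,Needham,Sydney\nDavid,Smith,London\n")
sample.close

data_rows = []

# With :headers => true the first line is consumed as the header,
# so the block only ever sees data rows (as CSV::Row objects).
CSV.foreach(sample.path, :headers => true) do |row|
  data_rows << row.fields # CSV::Row#fields returns just the values as an Array
end

sample.unlink # remove the temp file
```

Each element of data_rows is then an ordinary Array of Strings, and the file is read one row at a time.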

waldyr.ar answered Oct 20 '22 01:10


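You might want to consider CSV.parse(csv_file, :headers => false) and passing a block. Note that CSV.parse expects the CSV content as a String rather than a file path, and that on Ruby 3 and later the options must be passed as keyword arguments, without surrounding braces.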

jodell answered Oct 20 '22 02:10


A neat way to ignore the headers is to read the file as an array of rows and drop the first one:

data = CSV.read("dataset.csv")[1..-1]
# => [["first_row", "with data"],
#     ["second_row", "and more data"],
#     ...
#     ["last_row", "finally"]]

The problem with the :headers => false approach is that CSV won't treat the first row as a header; it treats it as ordinary data. So the header row still comes back as the first row of the result, where it is useless.
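
The difference can be checked directly. This is a small sketch, assuming CSV text that mirrors the sample data from the question (csv_text and the two result variables are names chosen for illustration):

```ruby
require 'csv'

# Hypothetical CSV text mirroring the question's sample data.
csv_text = "header1,header2\nfoo,bar\nbaz,yak\n"

# :headers => false is the default, so the header line is parsed as ordinary data.
with_header = CSV.parse(csv_text, :headers => false)

# Dropping the first element leaves only the data rows, each a plain Array.
without_header = with_header.drop(1)

p with_header.first # the header row is still present here
p without_header    # only the data rows remain
```

Array#drop(1) behaves like the [1..-1] slice for non-empty input, but returns an empty Array instead of nil when there are no rows at all.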

agarie answered Oct 20 '22 01:10