
Ruby CSV - get current line/row number

Tags: ruby, csv
I'm trying to work out how to get the current line/row number from Ruby CSV. This is my code:

options = {:encoding => 'UTF-8', :skip_blanks => true}
CSV.foreach("data.csv", options, ) do |row, i|
   puts i
end

But this doesn't seem to work as expected. Is there a way to do this?

user1513388 asked Sep 13 '12 13:09


3 Answers

Because of changes to CSV in current Rubies, the original solution needs updating. The original solution, for Ruby prior to 2.6, appears further down in this answer, along with with_index, which continues to work regardless of the version.

For Ruby 2.6+ this will work:

require 'csv'

puts RUBY_VERSION

csv_file = CSV.open('test.csv')
csv_file.each do |csv_row|
  puts '%i %s' % [csv_file.lineno, csv_row]
end
csv_file.close

If test.csv contains:

Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!\nair, moon roof, loaded",4799.00

The code results in this output:

2.6.3
1 ["Year", "Make", "Model", "Description", "Price"]
2 ["1997", "Ford", "E350", "ac, abs, moon", "3000.00"]
3 ["1999", "Chevy", "Venture \"Extended Edition\"", "", "4900.00"]
4 ["1999", "Chevy", "Venture \"Extended Edition, Very Large\"", "", "5000.00"]
5 ["1996", "Jeep", "Grand Cherokee", "MUST SELL!\\nair, moon roof, loaded", "4799.00"]

The change is needed because we now have to go through the handle of the file being read. Previously we could use the global $., but that always carried a risk of failure, because globals can be overwritten by other code that gets called along the way. If we hold the handle of the open file, we can use its lineno without that concern.
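As a minimal sketch of the same idea, the block form of CSV.open hands us the handle and closes the file when the block returns. The helper name rows_with_lineno and the temporary sample file are assumptions made for illustration, not part of the original answer:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: pair each parsed row with the reader's own counter.
def rows_with_lineno(path)
  result = []
  # The block form of CSV.open closes the file when the block returns.
  CSV.open(path) do |csv|
    csv.each { |row| result << [csv.lineno, row] }
  end
  result
end

# Illustrative data written to a temporary file so the sketch is self-contained.
Tempfile.create(['sample', '.csv']) do |file|
  file.write("Year,Make\n1997,Ford\n1999,Chevy\n")
  file.flush
  rows_with_lineno(file.path).each do |lineno, row|
    puts "#{lineno} #{row.inspect}"
  end
end
```

Because the counter belongs to the CSV object, it should not be disturbed by other code that happens to read from a different file in the meantime.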


$.

Ruby prior to 2.6 would let us use $., a magic variable that holds the number of the last line read from the current input file:

require 'csv'

CSV.foreach('test.csv') do |csv|
  puts $.
end

With the code above, I get:

1
2
3
4
5

$INPUT_LINE_NUMBER

$. is used all the time in Perl. In Ruby, it's recommended we use it the following way to avoid the "magical" side of it:

require 'English'

puts $INPUT_LINE_NUMBER
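
To make the alias concrete, here is a small sketch that reads a plain text file with File.foreach and records $INPUT_LINE_NUMBER after each line. The helper name input_line_numbers and the temporary file are hypothetical, and the library name is spelled with a capital E, which matters on case-sensitive file systems:

```ruby
require 'English'
require 'tempfile'

# Hypothetical helper: collect $INPUT_LINE_NUMBER (the alias for $.)
# as each line of the file is read.
def input_line_numbers(path)
  numbers = []
  File.foreach(path) { numbers << $INPUT_LINE_NUMBER }
  numbers
end

# Illustrative three-line file so the sketch is self-contained.
Tempfile.create('lines') do |file|
  file.write("first\nsecond\nthird\n")
  file.flush
  p input_line_numbers(file.path)
end
```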

If it's necessary to deal with embedded line-ends in fields, it's easily handled by a minor modification. Assuming a CSV file "test.csv" which contains a line with an embedded new-line:

Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00

with_index

Using Enumerator's with_index(1) makes it easy to keep track of the number of times CSV yields to the block, effectively simulating using $. but honoring CSV's work when reading the extra lines necessary to deal with the line-ends:

require 'csv'

CSV.foreach('test.csv', headers: true).with_index(1) do |row, ln|
  puts '%-3d %-5s %-26s %s' % [ln, *row.values_at('Make', 'Model', 'Description')]
end

Which, when run, outputs:

$ ruby test.rb
1   Ford  E350                       ac, abs, moon
2   Chevy Venture "Extended Edition"
3   Jeep  Grand Cherokee             MUST SELL!
air, moon roof, loaded
4   Chevy Venture "Extended Edition, Very Large"
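
The difference between counting records and counting physical lines can be sketched without a file by feeding CSV a string. The constant SAMPLE and the helper numbered_models are illustrative assumptions; the data is a shortened version of the file above:

```ruby
require 'csv'

# Illustrative data: the second record spans two physical lines.
SAMPLE = <<~'EOS'
  Year,Make,Model,Description,Price
  1997,Ford,E350,"ac, abs, moon",3000.00
  1996,Jeep,Grand Cherokee,"MUST SELL!
  air, moon roof, loaded",4799.00
  1999,Chevy,"Venture ""Extended Edition""","",4900.00
EOS

# Hypothetical helper: number records as CSV yields them, starting at 1.
def numbered_models(text)
  CSV.new(text, headers: true).each.with_index(1).map do |row, n|
    [n, row['Model']]
  end
end

records = numbered_models(SAMPLE)
puts "physical lines: #{SAMPLE.lines.count}, data records: #{records.size}"
records.each { |n, model| puts "#{n} #{model}" }
```

The index follows the records CSV yields, so the embedded line-end does not shift the numbering of the rows after it.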
the Tin Man answered Oct 18 '22 14:10


Here's an alternative solution:

options = {:encoding => 'UTF-8', :skip_blanks => true}

CSV.foreach("data.csv", **options).with_index do |row, i|
   puts i
end
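
Note that with_index starts counting at 0 unless it is given an offset. The sketch below, which assumes a hypothetical helper row_numbers and inline sample data, shows how the offset changes the numbers, including an offset of 2 so that the index matches the file line when a header row is consumed (assuming no field contains an embedded line-end):

```ruby
require 'csv'

SAMPLE_ROWS = "Make,Model\nFord,E350\nJeep,Grand Cherokee\n"

# Hypothetical helper: return the index assigned to each yielded row.
def row_numbers(text, offset: 0, **csv_options)
  CSV.new(text, **csv_options).each.with_index(offset).map { |_row, i| i }
end

p row_numbers(SAMPLE_ROWS)                            # zero-based, header row included
p row_numbers(SAMPLE_ROWS, offset: 1)                 # one-based, header row included
p row_numbers(SAMPLE_ROWS, headers: true, offset: 2)  # header consumed, data starts at line 2
```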
Josh Voigts answered Oct 18 '22 14:10


Not a clean solution, but a simple one:

options = {:encoding => 'UTF-8', :skip_blanks => true}
i = 0
CSV.foreach("data.csv", **options) do |row|
  puts i
  i += 1
end
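
One caveat with a hand-rolled counter: it counts the rows CSV yields, not the lines in the file. With :skip_blanks set, blank lines are never yielded, so the two can differ. A small sketch, assuming a hypothetical helper count_rows and inline sample data:

```ruby
require 'csv'

# Hypothetical helper: count yielded rows by hand, as in the answer above.
def count_rows(text, **csv_options)
  i = 0
  CSV.new(text, **csv_options).each { |_row| i += 1 }
  i
end

# Illustrative data with a blank line between two records.
WITH_BLANK_LINE = "a,b\n\nc,d\n"

puts "rows yielded with skip_blanks: #{count_rows(WITH_BLANK_LINE, skip_blanks: true)}"
puts "physical lines in the data:    #{WITH_BLANK_LINE.lines.count}"
```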
undur_gongor answered Oct 18 '22 13:10