Read compressed csv file on-the-fly

Tags: ruby, csv, bzip2

I wrote a CSV file and compressed it using this code:

arr = (0...2**16).to_a
File.open('file.bz2', 'wb') do |f|
  writer = Bzip2::Writer.new f
  CSV(writer) do |csv|
    (2**16).times { csv << arr }
  end
  writer.close
end

I want to read this bzip2-compressed CSV file. Uncompressed, such files look like:

1,2
4,12
5,2
8,7
1,3
...

So I tried this code:

Bzip2::Reader.open(filename) do |bzip2|
  CSV.foreach(bzip2) do |row|
    puts row.inspect
  end
end

but when it is executed, it throws:

/Users/foo/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/csv.rb:1256:in `initialize': no implicit conversion of Bzip2::Reader into String (TypeError)
from /Users/foo/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/csv.rb:1256:in `open'
from /Users/foo/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/csv.rb:1256:in `open'
from /Users/foo/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/csv.rb:1121:in `foreach'
from worm_pathfinder_solver.rb:79:in `block in <main>'
from worm_pathfinder_solver.rb:77:in `open'
from worm_pathfinder_solver.rb:77:in `<main>'

Question:

What is wrong? How should I do this?

Israel asked May 04 '14 03:05


2 Answers

CSV.foreach assumes you're passing a file path to open. If you want to pass a stream to CSV you need to be more explicit and use CSV.new. This code will process a gzipped file:

Zlib::GzipReader.open(filename) do |gzip|
  csv = CSV.new(gzip)
  csv.each do |row|
    puts row.inspect
  end
end
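
The same pattern can be tried without any file on disk. The following is only a minimal sketch: it uses Zlib from the standard library, an in-memory StringIO in place of a real file, and the sample rows from the question. The variable names (buffer, rows) are illustrative.

```ruby
require 'csv'
require 'stringio'
require 'zlib'

# Build a small gzipped CSV in memory (sample rows from the question).
buffer = StringIO.new
gz = Zlib::GzipWriter.new(buffer)
gz.write("1,2\n4,12\n5,2\n")
gz.finish        # write the gzip trailer but leave the StringIO open
buffer.rewind

# CSV.new accepts an IO-like object, so rows are parsed from the
# decompressing stream instead of from a file path.
rows = []
Zlib::GzipReader.wrap(buffer) do |gzip|
  CSV.new(gzip).each { |row| rows << row }
end

puts rows.inspect
```

Whether a Bzip2::Reader can be dropped in for the GzipReader depends on it exposing the IO-like methods CSV needs, which is an assumption not verified here.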
Brian Durand answered Oct 01 '22 15:10


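Based on the brief docs, you'll probably need to call the read method on the bzip2 object and hand the resulting string to CSV.parse rather than CSV.foreach, because CSV.foreach would treat the string as a file path (not tested). Note that this reads the whole decompressed file into memory:

Bzip2::Reader.open(filename) do |bzip2|
  CSV.parse(bzip2.read) do |row|
    #             ^^^^
    puts row.inspect
  end
end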
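
The distinction between a path and CSV text can be checked with the standard library alone. This is a hedged sketch in which a literal string stands in for the decompressed data; the names data, parsed and foreach_error are illustrative.

```ruby
require 'csv'

# Stand-in for the decompressed contents (sample rows from the question).
data = "1,2\n4,12\n5,2\n"

# CSV.parse takes the CSV text itself and yields one row at a time.
parsed = []
CSV.parse(data) { |row| parsed << row }

# CSV.foreach interprets its argument as a path to open, so handing it
# the CSV text fails when no file with that name exists.
foreach_error = nil
begin
  CSV.foreach(data) { |row| }
rescue SystemCallError => e
  foreach_error = e.class
end

puts parsed.inspect
puts foreach_error.inspect
```

Because the full string must exist before parsing starts, this approach is not on-the-fly; for large files a stream passed to CSV.new is likely the better fit.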
David Unric answered Oct 01 '22 16:10