Reading files in a zip archive, without unzipping the archive

Tags: ruby

I have a directory with 100+ zip files and I need to read the files inside the zip files to do some data processing, without unzipping the archive.

Is there a Ruby library to read the contents of files in zip archives, without unzipping the file?

I tried rubyzip with the following code:

require 'zip'

Zip::File.open('my_zip.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    # Extract to file/directory/symlink
    puts "Extracting #{entry.name}"
    entry.extract('here')

    # Read into memory
    content = entry.get_input_stream.read
  end
end 

It fails with this error:

test.rb:12:in `block (2 levels) in <main>': undefined method `read' for Zip::NullInputStream:Module (NoMethodError)
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:42:in `call'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:42:in `block in each'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:41:in `each'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/entry_set.rb:41:in `each'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/central_directory.rb:182:in `each'
    from test.rb:6:in `block in <main>'
    from .gem/ruby/gems/rubyzip-1.1.6/lib/zip/file.rb:99:in `open'
    from test.rb:4:in `<main>'
user2840647 asked Jan 23 '15 00:01



2 Answers

Zip::NullInputStream is returned when the entry is a directory rather than a file. Could that be the case here?

Here's a more robust variation of the code:

#!/usr/bin/env ruby

require 'rubygems'
require 'zip'


Zip::File.open('my_zip.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    if entry.directory?
      puts "#{entry.name} is a folder!"
    elsif entry.symlink?
      puts "#{entry.name} is a symlink!"
    elsif entry.file?
      puts "#{entry.name} is a regular file!"

      # Read into memory; assign the block's return value so that
      # content is still in scope after the block
      content = entry.get_input_stream { |io| io.read }

      # Output
      puts content
    else
      puts "#{entry.name} is something unknown, oops!"
    end
  end
end
sled answered Jan 04 '23 19:01


I came across the same issue. Checking entry.file? before calling entry.get_input_stream.read resolved it.

require 'zip'

Zip::File.open('my_zip.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    # Extract to file/directory/symlink
    puts "Extracting #{entry.name}"
    entry.extract('here')

    # Read into memory
    if entry.file?
      content = entry.get_input_stream.read
    end
  end
end 
Shravan Ramamurthy answered Jan 04 '23 21:01