
Extract certain files from .zip

Tags: r, unzip

Is there a way to selectively extract from a .zip archive those files with names matching a pattern?

For example, I want to extract all .csv files from the archive and ignore the other files.

Current approach:

zipped_file_names <- unzip('some_archive.zip') # extracts everything, captures file names
csv_nms <-  grep('csv', zipped_file_names, ignore.case=TRUE, value=TRUE)
library('data.table')
comb_tbl <- rbindlist(lapply(csv_nms,  function(x) cbind(fread(x, sep=',', header=TRUE, 
                                                               stringsAsFactors=FALSE), 
                                                         file_nm=x) ), fill=TRUE ) 

Instead of just selecting which ones to read (csv_nms), I'm looking for a way to choose which ones to extract in the first place.

I'm currently on R v3.2.2 (Windows).

C8H10N4O2 asked Sep 30 '15 at 16:09



1 Answer

Thanks to a comment from @user20650.

Use two calls to unzip. The first, with list=TRUE, only reads the archive's index so you can get the $Name of each file without extracting anything. The second, with files=, extracts only the files whose names match the pattern.

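  library(data.table)  # provides fread() and rbindlist()

  # List the archive's contents without extracting, keeping names that end in .csv
  zipped_csv_names <- grep('\\.csv$', unzip('some_archive.zip', list=TRUE)$Name, 
                           ignore.case=TRUE, value=TRUE)
  # Extract only the matching files (into the working directory)
  unzip('some_archive.zip', files=zipped_csv_names)
  # Read each CSV, tag its rows with the source file name, and stack the results
  comb_tbl <- rbindlist(lapply(zipped_csv_names,  
                               function(x) cbind(fread(x, sep=',', header=TRUE,
                                                       stringsAsFactors=FALSE),
                                                 file_nm=x)), fill=TRUE)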
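
If you need this in more than one place, the two-call approach can be wrapped in a small helper. The following is only a sketch: extract_matching and its arguments are hypothetical names, not part of base R, and it assumes you would rather extract into a separate directory (via unzip's exdir argument) than into the working directory.

```r
# Hypothetical helper (not part of base R): extract only the archive members
# whose names match `pattern`, and return the paths of the extracted files.
extract_matching <- function(zip_path, pattern = "\\.csv$",
                             exdir = tempfile("unzipped_")) {
  # list = TRUE reads the archive index without extracting anything
  members <- unzip(zip_path, list = TRUE)$Name
  wanted <- grep(pattern, members, ignore.case = TRUE, value = TRUE)

  # Guard against an empty match: an empty `files` argument may be treated
  # like "no filter", which would extract the whole archive.
  if (length(wanted) == 0L) {
    return(character(0))
  }

  # files = restricts extraction to the matched members;
  # exdir is created if it does not exist yet.
  unzip(zip_path, files = wanted, exdir = exdir)
}
```

Because the helper returns the extracted paths, you could pass its result straight to the lapply/fread step, e.g. csv_paths <- extract_matching('some_archive.zip'), instead of reusing the names from the archive listing.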
C8H10N4O2 answered Oct 23 '22 at 03:10