Check existence of file in archive (zip)

Tags:

r

I'm using unz() to read data from a file inside a zip archive. This works well, but I have a lot of zip files and need to check whether a specific file exists in each archive. I could not get a working solution with if, exists or else.

Does anyone know how to check whether a file exists in an archive without extracting the whole archive first?

Example:

read.table(unz("D:/Data/Test.zip", "data.csv"), sep = ";")[-1,]

This works well if data.csv exists, but it gives an error if the file is not in the archive Test.zip:

Error in open.connection(file, "rt") : cannot open the connection  
In addition: Warning message:
In open.connection(file, "rt") :  
  cannot locate file 'data.csv' in zip file 'D:/Data/Test.zip'

Any comments are welcome!

kukuk1de asked Jan 08 '23 23:01


1 Answer

You could use unzip(file, list = TRUE)$Name to get the names of the files in the zip without having to unzip it. Then you can check to see if the files you need are in the list.

## character vector of all file names in the zip
fileNames <- unzip("D:/Data/Test.zip", list = TRUE)$Name

## check if any of those are 'data.csv' (or others)
check <- basename(fileNames) %in% "data.csv"

## extract only the matching files
if(any(check)) {
    unzip("D:/Data/Test.zip", files = fileNames[check], junkpaths = TRUE)
}

You could probably add another if() branch that calls unz() when exactly one file name matches, since it's faster than running unzip() on a single file.
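As a rough sketch of that idea, the check and the unz() read can be wrapped in one function. The name read_if_in_zip() and its arguments are hypothetical (not part of base R); it assumes the target is a delimited text file that read.table() can parse, and it simply returns NULL when there is no match or more than one.

```r
## hypothetical helper, not part of base R:
## read `target` from `zipfile` only if exactly one entry matches
read_if_in_zip <- function(zipfile, target, ...) {
    ## list the archive contents without extracting anything
    fileNames <- unzip(zipfile, list = TRUE)$Name

    ## keep the full in-archive paths whose base name matches
    matches <- fileNames[basename(fileNames) %in% target]

    if (length(matches) == 1L) {
        ## read.table() opens and closes the unz() connection itself
        return(read.table(unz(zipfile, matches), ...))
    }

    ## zero or several matches: nothing is read
    NULL
}
```

With the paths from the question, a call might look like dat <- read_if_in_zip("D:/Data/Test.zip", "data.csv", sep = ";"), followed by an is.null(dat) check before dropping the first row with dat[-1, ].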

Rich Scriven answered Jan 11 '23 13:01