How can I grep for a text pattern in a zipped text file?

Our daily feed files average 2 GB in size. These files are archived to a single zip file at the end of each month and stored on a network share. From time to time, I need to search for certain records in those files. I do this by connecting to the shared server by remote desktop, unzipping the files to a temp folder, running a grep (or PowerShell) search, and then deleting the temp folder. Now, because our server is running low on disk space, unzipping them all to a temp folder is no longer recommended. What is an efficient way to run a regex search on those zipped files with minimal impact on disk or network resources?

dawntrader Avatar asked Aug 08 '09 20:08




2 Answers

The PowerShell Community Extensions (PSCX) include Read-Archive and Expand-Archive cmdlets, but don't (yet?) include a navigation provider, which would make what you want very simple. That said, you could use Read-Archive and Expand-Archive. Something like this untested bit:

    Read-Archive -Path foo.zip -Format Zip |
        Where-Object { $_.Name -like "*.txt" } |
        Expand-Archive -PassThru |
        Select-String "myRegex"

would let you search without extracting the entire archive.
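
If PSCX isn't available, the same idea (stream each archive member and match it line by line, never writing extracted files to disk) can be sketched in Python with only the standard library. This is a rough illustration, not a drop-in tool: the function name grep_zip, the ".txt" suffix filter, and the UTF-8 encoding are assumptions of mine, not details from the question.

```python
import io
import re
import zipfile


def grep_zip(zip_path, pattern, member_suffix=".txt", encoding="utf-8"):
    """Yield (member_name, line_number, line) for every matching line.

    Hypothetical helper: the suffix filter and encoding are assumptions.
    """
    regex = re.compile(pattern)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directories and members that don't look like text files.
            if info.is_dir() or not info.filename.endswith(member_suffix):
                continue
            # archive.open() decompresses incrementally as the stream is
            # read, so no extracted copy of the member is written to disk.
            with archive.open(info) as raw:
                text = io.TextIOWrapper(raw, encoding=encoding, errors="replace")
                for line_number, line in enumerate(text, start=1):
                    if regex.search(line):
                        yield info.filename, line_number, line.rstrip("\r\n")
```

Looping over grep_zip("foo.zip", "myRegex") and printing each tuple gives grep-style output. Running it on the server that holds the share, rather than against the share from another machine, should also keep the compressed data off the network.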

Scott Weinstein Avatar answered Oct 18 '22 12:10



Use zgrep on Linux. If you're on Windows, you can download GnuWin, which contains a Windows port of zgrep.
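
One caveat, as far as I know: zgrep decompresses its input with gzip before handing it to grep, so it is aimed at gzip-compressed files, and gzip can only read a zip archive that has a single deflate-compressed member. If the feeds were stored as individual .gz files instead, a zgrep-like search could be sketched in Python as below. The function name grep_gzip and the UTF-8 encoding are assumptions for illustration.

```python
import gzip
import re


def grep_gzip(gz_path, pattern, encoding="utf-8"):
    """Yield (line_number, line) for each line of a .gz file that matches.

    Hypothetical helper, roughly what `zgrep -n PATTERN FILE` reports.
    """
    regex = re.compile(pattern)
    # "rt" opens the stream in text mode; data is decompressed as it is
    # read, so the uncompressed file never has to exist on disk.
    with gzip.open(gz_path, "rt", encoding=encoding, errors="replace") as text:
        for line_number, line in enumerate(text, start=1):
            if regex.search(line):
                yield line_number, line.rstrip("\r\n")
```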

Mark Avatar answered Oct 18 '22 12:10
