
How to grep for a pattern in the files in a tar archive without filling up disk space

I have a very large tar archive (about 5 GB).

I want to grep for a pattern in all the files in the archive, and also print the name of each file that contains the pattern, but I do not want to fill up my disk by extracting the archive.

Is there any way I can do that?

I tried these, but this does not give me the file names that contain the pattern, just the matching lines:

tar -O -xf test.tar.gz | grep 'this'
tar -xf test.tar.gz --to-command='grep awesome'

Also, where is this feature of tar documented?

tar xf test.tar $FILE

Ankur Agarwal asked Oct 23 '12 23:10




1 Answer

Seems like nobody posted this simple solution that processes the archive only once:

tar xzf archive.tgz --to-command \
    'grep --label="$TAR_FILENAME" -H PATTERN ; true'

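Here tar passes the name of each archive member to the command in the TAR_FILENAME environment variable (see the docs), and grep uses it, via --label together with -H, as the file name printed with each match. The trailing true makes the command exit with status 0, because grep exits non-zero when it finds no match and tar would otherwise complain about failing to extract the files that don't match.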
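To try the command above end to end, here is a minimal sketch that builds a throwaway archive in a temporary directory and searches it. It assumes GNU tar (for --to-command and TAR_FILENAME) and GNU grep (for --label). The file names a.txt, b.txt and demo.tgz and the pattern awesome are made up for the demonstration.

```shell
#!/bin/sh
# Demo only: file names and pattern are hypothetical; assumes GNU tar and GNU grep.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Build a small archive with one matching and one non-matching file.
printf 'this is awesome\nplain line\n' > a.txt
printf 'nothing here\n' > b.txt
tar czf demo.tgz a.txt b.txt
rm a.txt b.txt

# tar feeds each member to the command on stdin and sets TAR_FILENAME;
# grep prints that name via --label/-H; "; true" hides grep's exit status 1.
result=$(tar xzf demo.tgz --to-command='grep --label="$TAR_FILENAME" -H awesome; true')

printf '%s\n' "$result"
# prints: a.txt:this is awesome
```

Because each member is piped to grep rather than written out, a.txt and b.txt are not recreated in the working directory. The single quotes around the command matter: they keep the calling shell from expanding $TAR_FILENAME before tar has set it.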

Petr answered Nov 13 '22 02:11