 

Why does grep hang when run against the / directory?

My question is in two parts:

1) Why does grep hang when I grep all files under "/"?

For example:

grep -r 'h' ./

(Note: right before the hang/crash, I see some "No such device or address" messages regarding sockets.)

Of course, I know that grep shouldn't be run against a socket, but I would think that since sockets are just files in Unix, it should return a negative result rather than crashing.

2) Now, my follow-up question: in any case, how can I grep the whole filesystem? Are there certain *NIX directories we should leave out when doing this? In particular, I'm looking for all recently written log files.

asked Nov 01 '11 by jayunit100

People also ask

Why is grep taking so long?

If you're running grep over a very large number of files, it will be slow because it needs to open and read through all of them. If you have some idea of where the file you're looking for might be, limit the search to those locations.
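For instance, GNU grep's --include option restricts a recursive search to files whose names match a glob (a sketch on a throwaway directory; the file names and the 'error' pattern are made up; in practice you would point this at something like /var/log):

```shell
# --include limits a recursive grep to matching file names (GNU grep).
tmp=$(mktemp -d)
printf 'error: disk full\n' > "$tmp/syslog.log"
printf 'error: ignored\n'   > "$tmp/notes.txt"   # wrong suffix: excluded

grep -r --include='*.log' 'error' "$tmp"   # only syslog.log is searched
rm -rf "$tmp"
```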

Does grep work on directories?

The grep command is a useful Linux command for performing file content searches. It also lets us search recursively through a specific directory to find all files matching a given pattern.

What is faster than grep?

The grep utility searches text files for regular expressions, but it can also search for ordinary strings, since these are a special case of regular expressions. However, if your regular expressions are in fact simply text strings, fgrep may be much faster than grep.
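A quick illustration of the difference (grep -F is the modern spelling of fgrep; the sample strings here are made up):

```shell
# grep -F treats the pattern as a literal string, so regex
# metacharacters like '.' are not special.
printf 'a.b\naxb\n' | grep -F 'a.b'   # matches only the literal "a.b"
printf 'a.b\naxb\n' | grep 'a.b'      # '.' matches any char: both lines
```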

What does the -v option to grep do?

-v means "invert the match" in grep; in other words, it returns all non-matching lines.
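For example (sample input made up):

```shell
# grep -v prints the lines that do NOT match the pattern.
printf 'apple\nbanana\ncherry\n' | grep -v 'an'   # prints apple and cherry
```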


1 Answer

As @ninjalj said, if you don't use -D skip, grep will try to read all your device files, socket files, and FIFO files. In particular, on a Linux system (and many Unix systems), it will try to read /dev/zero, which appears to be infinitely long.

You'll be waiting for a while.
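You can see -D skip in action without touching real devices by trying it on a FIFO (a sketch using a throwaway directory; without -D skip, grep would block trying to open the pipe, since no process ever writes to it):

```shell
# With -D skip, grep ignores devices, FIFOs, and sockets, so the search
# returns promptly with the match from the regular file.
tmp=$(mktemp -d)
echo 'needle' > "$tmp/haystack.txt"
mkfifo "$tmp/pipe"

grep -r -D skip 'needle' "$tmp"
rm -rf "$tmp"
```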

If you're looking for a system log, starting from /var/log is probably the best approach.

If you're looking for something that really could be anywhere in your file system, you can do something like this:

find / -xdev -type f -print0 | xargs -0 grep -H pattern

The -xdev argument to find tells it to stay within a single filesystem; this will avoid /proc and /dev (as well as any mounted filesystems). -type f limits the search to ordinary files. -print0 prints the file names separated by null characters rather than newlines; this avoids problems with files having spaces or other funny characters in their names.

xargs reads a list of file names (or anything else) on its standard input and invokes the specified command on everything in the list. The -0 option works with find's -print0.

The -H option to grep tells it to prefix each match with the file name. By default, grep does this only if there are two or more file names on its command line. Since xargs splits its arguments into batches, it's possible that the last batch will have just one file, which would give you inconsistent results.

Consider using find ... -name '*.log' to limit the search to files with names ending in .log (assuming your log files have such names), and/or using grep -I ... to skip binary files.
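Putting those refinements together (demonstrated on a throwaway tree; substitute / for "$tmp" in real use, and expect it to take a while):

```shell
# Restrict the search to *.log regular files and skip binary files (-I);
# -H makes grep always print the matching file's name.
tmp=$(mktemp -d)
printf 'error: boom\n' > "$tmp/app.log"
printf 'error: boom\n' > "$tmp/app.txt"   # wrong suffix: excluded

find "$tmp" -xdev -type f -name '*.log' -print0 \
    | xargs -0 grep -I -H 'error'         # prints only app.log's line
rm -rf "$tmp"
```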

Note that all this depends on GNU-specific features. Some of these options might not be available on macOS (which is based on BSD) or on other Unix systems. Consult your local documentation, and consider installing GNU findutils (for find and xargs) and/or GNU grep.

Before trying any of this, use df to see just how big your root filesystem is. Mine is currently 268 gigabytes; searching all of it would probably take several hours. A few minutes spent (a) restricting the files you search and (b) making sure the command is correct will be well worth the time you spend.

answered Sep 19 '22 by Keith Thompson