I need to search a directory with lots of subdirectories for a string inside files.
I'm using:
grep -c -r "string here" *
How can I get a total count of the matches?
How can I output to a file only those files with at least one match?
Using grep -c alone counts the number of lines that contain a match, not the total number of matches. The -o option tells grep to print each match on its own line, and wc -l then counts those lines. That is how the total number of matches is obtained.
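A minimal sketch of that difference, assuming a throwaway directory created with mktemp; the file names and contents are made up for illustration:

```shell
# Build a small scratch tree (hypothetical files, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'foo foo\nbar\n' > "$demo/a.txt"
printf 'foo\n' > "$demo/sub/b.txt"

# Count LINES that contain "foo" across all files.
# tr strips the padding some wc implementations add.
matching_lines=$(grep -r "foo" "$demo" | wc -l | tr -d ' ')

# Count every OCCURRENCE of "foo": -o prints each match on its own line.
total_matches=$(grep -r -o "foo" "$demo" | wc -l | tr -d ' ')

echo "matching lines: $matching_lines, total matches: $total_matches"
rm -rf "$demo"
```

The first line of a.txt contains the string twice, so the two numbers differ: two matching lines, three matches.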
Recursive Search
To search for a pattern recursively, invoke grep with the -r option (or --recursive). With this option, grep searches all files in the specified directory and its subdirectories, skipping any symbolic links it encounters during the recursion.
The same search can be run with -R instead of -r. The -R option also searches files recursively through subdirectories, starting from the directory you name, and in GNU grep it additionally follows symbolic links. Run the command from the top-level directory of the tree you want to search, for instance /home/abc. This kind of search is useful, for example, for tracking down dependencies when moving from one host to another.
To Search Subdirectories
To include all subdirectories in a search, add the -r option to the grep command. The command prints the matching lines from all files in the current directory and its subdirectories, each prefixed with the path of the file it was found in.
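As a rough illustration of a recursive search, here is a sketch on a scratch tree; the directory layout and the search string "needle" are invented for the example:

```shell
# Scratch tree with a match at the top level and one two levels down.
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
printf 'needle here\n' > "$demo/top.txt"
printf 'nothing\nneedle again\n' > "$demo/sub/deeper/nested.txt"

# -r descends into every subdirectory; -n adds the line number.
# Each hit is printed as path:line-number:matching-line.
hits=$(cd "$demo" && grep -rn "needle" . | sort)
echo "$hits"
rm -rf "$demo"
```

Searching "." rather than "*" also covers hidden files at the top level, which the shell glob would skip.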
Using Bash's process substitution, this gives what I believe is the output you want. (Please clarify the question if it's not.)
grep -r "string here" * | tee >(wc -l)
This runs grep -r normally, with output going both to stdout and to a wc -l process.
It works for me (it gets the number of lines containing 'string here' in each file). However, it does not display the total for ALL files searched. Here is how you can get it:
grep -c -r 'string' file > out && \
awk -F : '{total += $2} END { print "Total:", total }' out
The list will be in out and the total will be sent to STDOUT.
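One caveat with splitting on ":" is that a path containing a colon shifts the count out of $2. A possible variation, sketched here on a made-up scratch tree, sums the last field ($NF) instead:

```shell
# Scratch tree where one directory name contains a colon (hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/odd:dir"
printf 'import a\nimport b\n' > "$demo/odd:dir/x.py"
printf 'import c\n' > "$demo/y.py"

# $NF is the last colon-separated field, i.e. the count,
# even when the path itself contains ':'.
total=$(grep -c -r 'import' "$demo" | awk -F : '{ total += $NF } END { print total + 0 }')
echo "Total: $total"
rm -rf "$demo"
```

With $2 the line for x.py would contribute nothing to the sum, because its second field is part of the path rather than a number.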
Here is the grep -c / awk pipeline run on the Python 2.5.4 directory tree:
grep -c -r 'import' Python-2.5.4/ > out && \
awk -F : '{total += $2} END { print "Total:", total }' out
Total: 11500
$ head out
Python-2.5.4/Python/import.c:155
Python-2.5.4/Python/thread.o:0
Python-2.5.4/Python/pyarena.c:0
Python-2.5.4/Python/getargs.c:0
Python-2.5.4/Python/thread_solaris.h:0
Python-2.5.4/Python/dup2.c:0
Python-2.5.4/Python/getplatform.c:0
Python-2.5.4/Python/frozenmain.c:0
Python-2.5.4/Python/pyfpe.c:0
Python-2.5.4/Python/getmtime.c:0
If you only want the files that contain at least one occurrence of 'string', change it to this:
grep -c -r 'import' Python-2.5.4/ | \
awk -F : '{total += $2} $2 > 0 {print $1, $2} END { print "Total:", total }'
That will output:
[... snipped]
Python-2.5.4/Lib/dis.py 4
Python-2.5.4/Lib/mhlib.py 10
Python-2.5.4/Lib/decimal.py 8
Python-2.5.4/Lib/new.py 6
Python-2.5.4/Lib/stringold.py 3
Total: 11500
You can change how the file name ($1) and the count per file ($2) are printed.
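For the second part of the question (writing only the files with at least one match to a file), grep's -l option may be all that is needed. A small sketch, with invented file names and a temporary output file standing in for wherever you want the list:

```shell
# Scratch tree: two files contain "import", one does not.
demo=$(mktemp -d)
printf 'import os\nimport sys\n' > "$demo/two.py"
printf 'print(1)\n' > "$demo/none.py"
printf 'import re\n' > "$demo/one.py"

# -l prints a file's name once if it has at least one match, nothing otherwise.
list=$(mktemp)
(cd "$demo" && grep -rl "import" . | sort) > "$list"
matched=$(cat "$list")

# Same idea but keeping the per-file counts: drop the path:0 lines.
counts=$(cd "$demo" && grep -rc "import" . | grep -v ':0$' | sort)

echo "$matched"
echo "$counts"
rm -rf "$demo" "$list"
```

The list is written outside the searched tree so that grep does not end up scanning its own output file.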