I want to find the files that do not contain a specific string (in a directory and its subdirectories) and remove those files. How can I do this?
The following will work:
find . -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null rm
This first uses find to print the names of all the files in the current directory and any subdirectories. These names are printed with a null terminator rather than the usual newline separator (try piping the output to od -c to see the effect of the -print0 argument).
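A small sketch of the od -c suggestion above, run in a throwaway directory. The directory and file names are made up for illustration and are not from the question.

```shell
# Work in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt"

# Default output: names are separated by newlines (shown as \n by od -c).
find "$demo" -type f -print | od -c

# With -print0: each name is followed by a NUL byte (shown as \0 by od -c).
nul_dump=$(find "$demo" -type f -print0 | od -c)
printf '%s\n' "$nul_dump"

# Count the NUL terminators: one per file found.
nul_count=$(find "$demo" -type f -print0 | tr -cd '\0' | wc -c)

rm -rf "$demo"
```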
Then the --null parameter to xargs tells it to accept null-terminated inputs. xargs will then call grep on a list of filenames.
The -Z argument to grep works like the -print0 argument to find, so grep will print out its results null-terminated (which is why the final call to xargs needs a --null option too). The -L argument to grep causes grep to print the filenames of those files on its command line (that xargs has added) which don't match the regular expression:
my string
If you want simple matching without regular expression magic then add the -F option. If you want more powerful regular expressions then give a -E argument. It's a good habit to use single quotes rather than double quotes, as this protects you against any shell magic being applied to the string (such as variable substitution).
Finally you call xargs again to get rid of all the files that you've found with the previous calls.
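Since this pipeline deletes files, it may be worth trying it in a scratch directory first. The sketch below assumes GNU find, grep and xargs; the file names are invented, and the echo step is a dry run added here, not part of the original command.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'this line has my string in it\n' > "$demo/keep.txt"
printf 'nothing relevant\n'              > "$demo/sub dir/drop me.txt"

# A file name containing a newline, to exercise the null-terminated handling.
odd_name=$(printf 'odd\nname.txt')
printf 'also nothing\n' > "$demo/$odd_name"

# Dry run: replace rm with echo to see what would be deleted.
find "$demo" -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null -n 1 echo would remove:

# Real run: same pipeline as above, ending in rm.
find "$demo" -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null rm

# Only the file containing the string should be left.
remaining=$(find "$demo" -type f)
echo "$remaining"

rm -rf "$demo"
```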
The problem with calling grep directly from the find command with the -exec ... \; form is that grep then gets invoked once per file rather than once for a whole batch of files, as xargs does. Batching is much faster if you have lots of files. Also don't be tempted to do stuff like:
rm $(some command that produces lots of filenames)
It's always better to pass the filenames to xargs, as it knows the maximum command-line length and will call rm multiple times, each time with as many arguments as it can.
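The batching can be made visible with a tiny sketch. Here -n 2 forces an artificially small batch size and echo stands in for rm, so nothing is deleted; the five names are placeholders.

```shell
# The system's limit on the combined size of arguments and environment.
getconf ARG_MAX

# Feed five null-terminated names to xargs, at most two per invocation.
# Each output line corresponds to one invocation of the command.
batches=$(printf '%s\0' one two three four five | xargs --null -n 2 echo rm)
printf '%s\n' "$batches"
```

Without -n, xargs packs as many names as fit under the limit into each call, so in practice rm usually runs far fewer times than there are files.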
Note that this solution would have been simpler without the need to cope with filenames containing white space and newlines.
Alternatively
grep -r -L -Z 'my string' . | xargs --null rm
will work too (and is shorter). The -r argument to grep causes it to read all files in the directory and recursively descend into any subdirectories. Use the find ... approach if you want to do some other tests on the files as well (such as age or permissions).
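As one possible example of combining the search with another test, the sketch below only considers files last modified more than 30 days ago. The 30-day threshold and file names are arbitrary, touch -d is GNU-specific, and --no-run-if-empty is an addition here so that grep and rm are not run at all when the filter selects no files.

```shell
demo=$(mktemp -d)
printf 'no match here\n' > "$demo/old.log"
printf 'no match here\n' > "$demo/new.log"

# Backdate one file so that it passes the age test (GNU touch syntax).
touch -d '40 days ago' "$demo/old.log"

# Only files older than 30 days are handed to grep; of those, the ones
# lacking the string are removed.
find "$demo" -type f -mtime +30 -print0 |
  xargs --null --no-run-if-empty grep -Z -L 'my string' |
  xargs --null --no-run-if-empty rm

# new.log also lacks the string, but it is too recent to be considered.
survivors=$(ls "$demo")
echo "$survivors"

rm -rf "$demo"
```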
Note that any of the single-letter arguments, with a single dash introducer, can be grouped together (for instance as -rLZ). But note also that find does not use the same conventions and has multi-letter arguments introduced with a single dash. This is for historical reasons and hasn't ever been fixed because it would have broken too many scripts.
With GNU grep and bash:
grep -rLZ "$str" . | while IFS= read -rd '' x; do rm "$x"; done
Use a find solution if portability is needed. The grep -r version above is slightly faster.
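For the portable case, one possible find-only sketch that avoids GNU-specific options is shown below. It is not taken from the answers above; it relies on grep -q and the -exec ... {} + form, and it runs grep once per file, so it trades speed for portability. The file names are invented.

```shell
demo=$(mktemp -d)
printf 'has my string\n' > "$demo/keep.txt"
printf 'other text\n'    > "$demo/drop me.txt"

# "! -exec grep -q ... \;" is true when grep finds no match in the file;
# such files are then passed to rm in batches by "-exec rm {} +".
# Caveat: grep also exits non-zero on errors (e.g. an unreadable file),
# and those files would be removed as well.
find "$demo" -type f ! -exec grep -q 'my string' {} \; -exec rm {} +

left=$(ls "$demo")
echo "$left"

rm -rf "$demo"
```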