List all the files that do not contain 2 different strings
I have a directory with numerous files named in a pattern, e.g. file1.txt.
I can list all the files that do not contain one string:
grep -L "String" file*
How can I list files that do not contain two strings? I tried:
grep -L "string1|string2" file*
Assuming you want to just print the names of files that contain ALL strings, here's a solution that will work for any number of strings and will do a string comparison, not a regular expression comparison:
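# RS='\0' makes gawk treat each file as one record (assuming the file has no NUL bytes).
gawk -v RS='\0' -v strings="string1 string2" '
BEGIN { numStrings = split(strings, stringsA) }
{
    # Count how many of the strings occur literally in this file.
    matchCnt = 0
    for (stringNr=1; stringNr<=numStrings; stringNr++)
        if ( index($0, stringsA[stringNr]) )
            matchCnt++
}
matchCnt == numStrings { print FILENAME }
' file*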
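To illustrate what "a string comparison, not a regular expression comparison" means here, the following is a minimal sketch. The sample input axb and the search string a.b are made up for the demonstration: index() looks for the literal text, whereas the ~ operator would treat the dot as "any character".

```shell
# Sample input "axb" and search string "a.b" are illustrative assumptions.
out=$(printf 'axb\n' | awk -v s='a.b' '{
    # index() searches for the literal text a.b, which is not in axb.
    print (index($0, s) ? "index: match" : "index: no match")
    # The ~ operator treats s as a regex, so the dot matches the x.
    print ($0 ~ s ? "regex: match" : "regex: no match")
}')
echo "$out"
```

This is why the scripts in this answer are safe to use with search strings that happen to contain regex metacharacters such as . or *.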
Hang on, I just noticed you want to print the files that do NOT contain both strings (that is, files missing at least one of them). That would be:
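gawk -v RS='\0' -v strings="string1 string2" '
BEGIN { numStrings = split(strings, stringsA) }
{
    # Count how many of the strings occur literally in this file.
    matchCnt = 0
    for (stringNr=1; stringNr<=numStrings; stringNr++)
        if ( index($0, stringsA[stringNr]) )
            matchCnt++
}
# Remember every file that contains all of the strings.
matchCnt == numStrings { matchesAll[FILENAME] }
END {
    # Print every file named on the command line that was not remembered.
    for (fileNr=1; fileNr < ARGC; fileNr++) {
        file = ARGV[fileNr]
        if (! (file in matchesAll) )
            print file
    }
}
' file*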
To print the names of the files that contain neither string, use:
gawk -v RS='\0' -v strings="string1 string2" '
BEGIN { numStrings = split(strings, stringsA) }
{
    # Remember every file that contains at least one of the strings.
    for (stringNr=1; stringNr<=numStrings; stringNr++)
        if ( index($0, stringsA[stringNr]) )
            matchesOne[FILENAME]
}
END {
    # Print every file named on the command line that was not remembered.
    for (fileNr=1; fileNr < ARGC; fileNr++) {
        file = ARGV[fileNr]
        if (! (file in matchesOne) )
            print file
    }
}
' file*
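Where gawk is not available, the same two checks can be approximated with a plain shell loop around grep -F (fixed-string matching). This is only a sketch under stated assumptions: the function names missing_any and missing_all are hypothetical, the search strings are assumed to contain no whitespace or glob characters, and the three sample files are created purely for the demonstration.

```shell
# Hypothetical helper: print each file that lacks at least one of the strings.
# Usage: missing_any "string1 string2" file...
missing_any() {
    strings=$1; shift
    for f in "$@"; do
        for s in $strings; do            # unquoted on purpose: split on whitespace
            if ! grep -qF -- "$s" "$f"; then
                printf '%s\n' "$f"
                break
            fi
        done
    done
}

# Hypothetical helper: print each file that contains none of the strings.
missing_all() {
    strings=$1; shift
    for f in "$@"; do
        found=0
        for s in $strings; do
            if grep -qF -- "$s" "$f"; then found=1; break; fi
        done
        [ "$found" -eq 0 ] && printf '%s\n' "$f"
    done
    return 0
}

# Throwaway sample files (illustrative only).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'string1 and string2\n' > file1.txt   # contains both strings
printf 'only string1 here\n'   > file2.txt   # contains one string
printf 'nothing relevant\n'    > file3.txt   # contains neither string

missing_any "string1 string2" file*
missing_all "string1 string2" file*
```

The loop starts one grep process per string per file, so it is likely to be slower than the single gawk invocation on large file sets.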
You need the -E option for grep (extended regular expressions, so that | means alternation), or use egrep.
With egrep:
egrep -L "string1|string2" file*
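The effect of -E can be checked on a few throwaway files. The sketch below assumes three sample files whose names and contents are made up for the demonstration. Note that this form lists files containing neither string; it does not list files that contain only one of them.

```shell
# Throwaway files so the commands can be tried safely (names are illustrative).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf 'string1 and string2\n' > file1.txt   # contains both strings
printf 'only string1 here\n'   > file2.txt   # contains one string
printf 'nothing relevant\n'    > file3.txt   # contains neither string

# Without -E the | is a literal character, so no file matches and all are listed.
# "|| true" ignores grep's exit status, which may be non-zero here.
basic=$(grep -L 'string1|string2' file* || true)

# With -E the | means "or", so only files containing neither string are listed.
extended=$(grep -L -E 'string1|string2' file* || true)

echo "without -E: $basic"
echo "with -E:    $extended"
```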