The following FINDSTR example fails to find a match.
echo ffffaaa|findstr /l "ffffaaa faffaffddd"
Why?
When the search string contains multiple words separated by spaces, findstr returns lines that contain any of the words (OR). A literal search (/C:string) reverses this behaviour and allows searching for a phrase or sentence. A literal search also allows searching for punctuation characters.
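As a small illustration of the OR behaviour versus a /C: phrase search, the following sketch may help. The sample text and search words are made up for this example, and the two words deliberately have the same length.

```batch
@echo off
rem Space-separated words are OR-ed: the line is printed because it contains "hello".
echo hello world| findstr "hello there"

rem /C: treats the whole argument as one phrase: nothing is printed,
rem because the phrase "hello there" does not occur in the line.
echo hello world| findstr /C:"hello there"
```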
The findstr (short for find string) command is used on the Windows command line to locate files containing a specific string of plain text.
The command sends the matching lines to the standard output device. It is similar to the find command. However, while the find command supports UTF-16, findstr does not. On the other hand, findstr supports regular expressions, which find does not.
For example, findstr /s /i Windows *.* searches every file in the current directory and all subdirectories for the word Windows, regardless of case.
Apparently this is a long-standing FINDSTR bug. I think it can be a crippling bug, depending on the circumstances.
I have confirmed the command fails on two different Vista machines, a Windows 7 machine, and an XP machine. I found a thread titled "findstr - broken ???" which reports that a similar search fails on Windows Server 2003 but succeeds on Windows 2000.
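I've done a number of experiments and it seems all of the following conditions must be met for the potential of a failure:
- The search uses multiple literal search strings
- The search strings are of different lengths
- A short search string has some amount of overlap with a longer search string
- The search is case sensitive (no /I option)
In every failure I have seen, it is always one of the shorter search strings that fails.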
It does not matter how the search strings are specified. The same faulty result is achieved using multiple /C:"search" options and also with the /G:file option.
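To make the equivalence concrete, here are the three ways of passing the same two literal search strings. This is only a sketch: search_strings.txt is assumed to be a file holding one search string per line, and on affected systems none of the three commands is reported to print the expected match.

```batch
@echo off
rem Form 1: space-separated strings in one argument.
echo ffffaaa| findstr /L "ffffaaa faffaffddd"

rem Form 2: one /C: option per search string.
echo ffffaaa| findstr /L /C:"ffffaaa" /C:"faffaffddd"

rem Form 3: strings read from a file, one per line (file name is an assumption).
echo ffffaaa| findstr /L /G:"search_strings.txt"
```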
The only 3 workarounds I have been able to come up with are:
- Use the /I option if you don't care about case. Obviously this might not meet your needs.
- Use the /R regular expression option. But if you do, then you have to make sure you escape any meta-characters in the search so that it matches the result expected of a literal search. This can be problematic as well.
- If you are using the /V option, then use multiple piped FINDSTR commands with one search string each instead of one FINDSTR with multiple searches. This can also be a problem if you have a lot of search strings for which you want to use the /G:file option.
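The three workarounds might look roughly like this for the example from the question. Treat it as a sketch rather than a tested recipe; input.txt is a hypothetical file name used only for the /V case.

```batch
@echo off
rem Workaround 1: add /I when case does not matter.
echo ffffaaa| findstr /L /I "ffffaaa faffaffddd"

rem Workaround 2: switch to /R. These strings contain no meta-characters,
rem so nothing needs to be escaped here.
echo ffffaaa| findstr /R "ffffaaa faffaffddd"

rem Workaround 3, for /V: one search string per FINDSTR, chained by pipes.
rem Lines survive only if they contain neither string.
findstr /V /L /C:"ffffaaa" "input.txt" | findstr /V /L /C:"faffaffddd"
```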
I hate this bug!!!!
Note - See What are the undocumented features and limitations of the Windows FINDSTR command? for a comprehensive list of FINDSTR idiosyncrasies.
I cannot tell why findstr may fail with multiple literal strings. However, I can provide a method to work around that annoying bug.
Given that the literal search strings are listed in a text file called search_strings.txt:
ffffaaa
faffaffddd
you can convert them to regular expressions by inserting a backslash in front of every single character:
@echo off
rem Converts every line of search_strings.txt into a regular expression
rem in which each character except the angle brackets is escaped.
setlocal EnableExtensions DisableDelayedExpansion
> "regular_expressions.txt" (
    for /F usebackq^ delims^=^ eol^= %%S in ("search_strings.txt") do (
        set "REGEX=" & set "STRING=%%S"
        for /F delims^=^ eol^= %%T in ('
            cmd /U /V /C echo(!STRING!^| find /V ""
        ') do (
            set "ESCCHR=\%%T"
            if "%%T"=="<" (set "ESCCHR=%%T") else if "%%T"==">" (set "ESCCHR=%%T")
            setlocal EnableDelayedExpansion
            for /F "delims=" %%U in ("REGEX=!REGEX!!ESCCHR!") do (
                endlocal & set "%%U"
            )
        )
        setlocal EnableDelayedExpansion
        echo(!REGEX!
        endlocal
    )
)
endlocal
Then use the converted file regular_expressions.txt:
\f\f\f\f\a\a\a
\f\a\f\f\a\f\f\d\d\d
to do a regular expression search, which also seems to work fine with multiple search strings:
echo ffffaaa| findstr /R /G:"regular_expressions.txt"
The preceding backslashes simply escape every character including those that have a particular meaning in regular expression searches.
The characters < and > are excluded from being escaped in order to avoid conflicts with word boundaries, which are expressed by \< and \> when appearing at the beginning and at the end of a search string, respectively.
Since regular expressions are limited to 254 characters for findstr versions after Windows XP (as opposed to literal strings, which are limited to 511 characters), the length of the original search strings is limited to 127 characters, because the escaping turns every character into two.
Here is an alternative approach that only escapes the meta-characters ., *, ^, $, [, ], \ and ":
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "_META=.*^$[]\"^" & rem (including `"`)
> "regular_expressions.txt" (
    for /F usebackq^ delims^=^ eol^= %%S in ("search_strings.txt") do (
        set "REGEX=" & set "STRING=%%S"
        for /F delims^=^ eol^= %%T in ('
            cmd /U /V /C echo(!STRING!^| find /V ""
        ') do (
            set "CHR=%%T"
            setlocal EnableDelayedExpansion
            if not "!_META!"=="!_META:*%%T=!" set "CHR=\!CHR!"
            for /F "delims=" %%U in ("REGEX=!REGEX!!CHR!") do (
                endlocal & set "%%U"
            )
        )
        setlocal EnableDelayedExpansion
        echo(!REGEX!
        endlocal
    )
)
endlocal
The advantage of this method is that the length of the search strings is no longer limited to 127 characters but to 254 characters minus 1 for every occurrence of one of the aforementioned meta-characters. This applies to findstr versions after Windows XP.
Here is another workaround, which first does a case-insensitive search with findstr and then post-filters the result by case-sensitive comparisons:
echo ffffaaa|findstr /L /I "ffffaaa faffaffddd"|cmd /V /C set /P STR=""^&if @^^!STR^^!==@^^!STR:ffffaaa=ffffaaa^^! (echo(^^!STR^^!) else if @^^!STR^^!==@^^!STR:faffaffddd=faffaffddd^^! (echo(^^!STR^^!)
The double-escaped exclamation marks ensure the variable STR is expanded in the explicitly invoked cmd instance even when delayed expansion is enabled in the hosting cmd instance.
By the way, due to what I call a design flaw, searches with literal strings using findstr never work reliably as soon as they contain backslashes, because a backslash may still be consumed to escape a following meta-character, although that is not necessary. For example, the search string \. actually matches . and to truly match \. literally, you must specify the search string \\. instead. I do not understand why meta-characters are still recognised when doing literal searches; that is not what I call literal.
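The backslash behaviour described above could be checked with something like the following sketch. The sample lines a.b and a\.b are made up for illustration, and the expected outcomes in the comments simply restate the claim above; they have not been verified on every Windows version.

```batch
@echo off
rem Per the description above, \. in a literal search is read as an escaped dot,
rem so this is expected to match the line a.b even though it has no backslash.
echo a.b| findstr /L /C:"\."

rem Doubling the backslash asks for a real backslash followed by a dot,
rem which is expected to match the line a\.b.
echo a\.b| findstr /L /C:"\\."
```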