I'm a little confused about how many backslashes are needed to escape the alternation operator | in regular expressions for grep. This
echo abcdef | grep -e"def|zzz"
outputs nothing, because grep is not in extended regex mode. Escaping with one backslash works,
echo abcdef | grep -e"def\|zzz"
prints abcdef. More surprisingly, escaping with 2 backslashes also works,
echo abcdef | grep -e"def\\|zzz"
prints abcdef. Escaping with three backslashes fails,
echo abcdef | grep -e"def\\\|zzz"
prints nothing.
Does anyone have an explanation, especially for the 2-backslash case?
Edit:
Using this simple argument-printing program,
#include <stdio.h>

int main(int argc, char **argv)
{
    /* Print each argument exactly as the shell passed it in. */
    for (int i = 0; i < argc; i++)
        printf("Arg %d: %s\n", i, argv[i]);
    return 0;
}
I investigated what my shell does with the command lines above:
-e"def|zzz" becomes -edef|zzz
-e"def\|zzz" becomes -edef\|zzz
-e"def\\|zzz" becomes -edef\\|zzz
-e"def\\\|zzz" becomes -edef\\\|zzz
So all double-quotes are removed and the backslashes and pipes are not altered by the shell. I suspect grep itself does something special with the literal string \\|.
The lowercase -e option specifies a pattern and can be given more than once. grep prints every line that matches any of the patterns, so the alternation is implied:
$ echo abcdef | grep -e 'def' -e'zzz'
abcdef
$ echo abczzz | grep -e 'def' -e'zzz'
abczzz
Alternatively, you can use the uppercase -E option for extended regular expression notation:
$ echo abcdef | grep -E 'def|zzz'
abcdef
I believe this solves your problem directly (either using -e for alternation or -E for extended regex notation). Hope this helps :-)
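FWIW, the backslash behavior comes from the shell's quoting rules. | is special to bash only when it is unquoted; inside single or double quotes it is passed through literally. Inside double quotes, however, bash collapses \\ to a single \, so "def\\|zzz" reaches grep as def\|zzz (the same pattern as the one-backslash case), while "def\\\|zzz" reaches grep as def\\|zzz, which matches a literal backslash followed by a literal |. Inside single quotes nothing is altered. Here is a resource on quoting and escaping rules and the common pitfalls: https://web.archive.org/web/20230323230844/https://wiki.bash-hackers.org/syntax/quoting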
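To check what a POSIX-style shell such as bash actually hands to grep for each double-quoted form from the question, here is a minimal sketch. The helper name show_arg is made up for illustration, and the results assume bash or another POSIX sh; a different shell (Windows cmd, for example) may pass the backslashes through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: print one argument exactly as the shell delivers it.
show_arg() {
    printf '%s\n' "$1"
}

show_arg "def\|zzz"     # prints def\|zzz   (\| is left alone inside double quotes)
show_arg "def\\|zzz"    # prints def\|zzz   (\\ collapses to a single backslash)
show_arg "def\\\|zzz"   # prints def\\|zzz  (\\ collapses, \| is left alone)

# With three backslashes grep receives def\\|zzz, which in a basic regex
# means: def, a literal backslash, a literal |, then zzz.
# Only the second input line contains that text, so only it is printed.
printf '%s\n' 'abcdef' 'def\|zzz' | grep -e "def\\\|zzz"
```

If this holds for your shell, the one- and two-backslash commands give grep the identical pattern, and the three-backslash command fails on abcdef simply because that line contains no literal backslash.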