
Zsh backslash madness?

Tags: zsh, backslash

Zsh seems to do some weird backslashing when you try to echo a bunch of backslashes. I cannot seem to figure out a very clear pattern to this. Any reasons for this madness? Of course, if I actually wanted to use backslashes properly, then I'd use proper quoting etc, but why does this happen in the first place?

Here's a small example to show the same:

$ echo \\
\
$ echo \\ \\
\ \
$ echo \\ \\ \\
\ \ \
$ echo \\ \\ \\ \\
\ \ \ \
$ echo \\\\ \\ \\
\ \ \
$ echo \\\\\\ \\
\\ \
$ echo \\\\\\\\
\\

I initially independently discovered this a while ago, but was reminded of it by this tweet by Zach Riggle.

Jay Bosamiya asked Sep 17 '25 14:09


1 Answer

Two separate steps are involved: the shell parses the command line, and then echo processes the arguments it is given. In the first step, the echo command is not special. The command line is parsed by rules that are independent of what command is being executed. The overall effect of this step is to convert your command from a series of characters into a series of words.

The two general parsing rules you need to know to understand this example are: the space character separates words, and the backslash character escapes special characters, including itself.

So the command echo \\ becomes a list of 2 words:

echo
\

The first backslash escapes the second one, resulting in a single backslash being in the second word.

echo \\ \\ \\ \\

becomes this list of words:

echo
\
\
\
\

Now command line parsing is done. Only now does the shell look for a command named by the first word. Up until now, the fact that the command is echo has been irrelevant. If you'd said cat \\ \\ \\ \\, cat would be invoked with 4 argument words, each containing a single backslash.
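One way to see the word list for yourself is to hand the same arguments to something that prints each word untouched. This is a minimal sketch, assuming a POSIX-style shell. show_words is a hypothetical helper name, and printf with %s is used because it does not interpret backslashes in its arguments.

```shell
# show_words is a hypothetical helper: it prints each argument word on its
# own line, wrapped in <> so the word boundaries are visible.
# %s copies each argument verbatim, so this shows what the shell's parser
# produced, before any echo-style escape processing happens.
show_words() {
    printf '<%s>\n' "$@"
}

show_words \\ \\ \\ \\    # four words, one backslash each
show_words \\\\\\ \\      # two words: three backslashes, then one
```

The first call prints four lines of <\>, and the second prints <\\\> followed by <\>.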

Normally when you run echo you'll be getting the shell builtin command. The zsh builtin echo has configurable behavior. I like to use setopt BSD_ECHO to select BSD-style echo behavior, but from your sample output it appears you are in the default mode, SysV-style.

BSD-style echo doesn't do any backslash processing; it just prints its arguments exactly as it received them.

SysV echo processes backslash escapes like in C strings - \t becomes a tab character, \r becomes a carriage return, etc. Also \c is interpreted as "end the output without a newline".
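The difference between the two styles can be approximated without changing any zsh options. This sketch uses printf as a stand-in: %s leaves backslashes alone, much like BSD-style echo, while %b expands escapes in roughly the way SysV-style echo does. bsd_like and sysv_like are hypothetical names, and zsh's own echo may differ from %b in corner cases.

```shell
# Hypothetical stand-ins for the two echo styles, built on printf.
bsd_like()  { printf '%s\n' "$1"; }   # no escape processing on the argument
sysv_like() { printf '%b\n' "$1"; }   # C-style escapes in the argument are expanded

bsd_like  'a\tb'    # prints the four characters a \ t b
sysv_like 'a\tb'    # prints a and b separated by a tab
```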

So if you said echo a\\tb then the shell parsing would result in a single backslash in the argument word given to echo, and echo would interpret a\tb and print a and b separated by a tab. It would be more readable if written as echo 'a\tb', using apostrophes to provide quoting at the shell-command-parsing level. Likewise echo \\\\ is two backslashes after command line parsing, so echo sees \\ and outputs one backslash. If you wanted to print literally a\tb without using another form of quoting, you'd have to say echo a\\\\tb.

So the shell has a simple rule - two backslashes on the command line to make one backslash in the argument word. And echo has a simple rule - two backslashes in the argument word to make one backslash in the output.
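Both levels can be watched in action with printf's %b, used here as a stand-in for SysV-style echo. That substitution is an assumption made so the sketch behaves the same in other shells, and sysv_like is a hypothetical name.

```shell
# Hypothetical stand-in for SysV-style echo with a single argument.
sysv_like() { printf '%b\n' "$1"; }

# Command line     word after shell parsing     final output
sysv_like a\\tb      # a\tb                     a<TAB>b
sysv_like a\\\\tb    # a\\tb                    a\tb (literal)
sysv_like \\\\       # \\                       \
```

Each line halves the backslashes twice: once in the shell, once in the escape processing.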

But there's a problem... when echo does its thing, a backslash followed by t means output a tab, a backslash followed by a backslash means output a backslash... but there are lots of combinations that don't mean anything. A backslash followed by T for example is not a valid escape sequence. In C it would be a warning or an error. But the echo command tries to be more tolerant.

Try echo \\T or echo '\T' and you will discover that a backslash followed by anything that doesn't have a defined meaning as a backslash escape will just cause echo to output both characters as-is.

Which brings us to the last case: what if the backslash isn't followed by anything at all? What if it's the last character in the argument word? In that case, echo just outputs the backslash.

So in summary, two backslashes in the argument word result in one backslash in the output. But one backslash in the argument word also results in one backslash in the output, if it is the last character in the word or if the backslash together with the next character don't form a valid escape sequence.
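The tolerant cases can be sketched the same way. This again assumes printf's %b as an approximation of SysV-style echo. How undefined sequences and a trailing backslash are treated is not something to rely on in every shell, although bash and zsh behave as the comments describe. sysv_like is a hypothetical name.

```shell
# Hypothetical stand-in for SysV-style echo with a single argument.
sysv_like() { printf '%b\n' "$1"; }

sysv_like '\T'    # undefined escape: both characters come out as-is -> \T
sysv_like \\      # the word is a lone trailing backslash            -> \
sysv_like \\\\    # the word is a properly escaped backslash         -> \
```

The last two calls print the same thing, which is why the doubled and quadrupled forms look interchangeable.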

The command line echo \\\\ thus becomes the word list

echo
\\

which outputs a single backslash "properly", with quoting applied at all levels.

The command line echo \\ becomes the word list

echo
\

which outputs a single backslash "messily", because echo found a stray backslash at the end of the argument and was generous enough to output it for you even though it wasn't escaped.

The rest of the examples should be clear from these principles.
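For anyone who wants to check the remaining examples from the question, here is a rough model of the whole pipeline. It assumes printf's %b is close enough to zsh's default echo for these inputs. sysv_echo is a hypothetical name. One difference from a real echo is that \c would only cut off the current word here, not the whole line.

```shell
# sysv_echo is a hypothetical model of SysV-style echo: expand escapes in
# each argument word with %b and join the results with single spaces.
sysv_echo() {
    sep=''
    for word in "$@"; do
        printf '%s%b' "$sep" "$word"
        sep=' '
    done
    printf '\n'
}

sysv_echo \\ \\          # words: \  \        -> \ \
sysv_echo \\\\ \\ \\     # words: \\  \  \    -> \ \ \
sysv_echo \\\\\\ \\      # words: \\\  \      -> \\ \
sysv_echo \\\\\\\\       # words: \\\\        -> \\
```

In the third call the word \\\ is an escaped backslash followed by a stray one, so it yields two backslashes.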