I often see this construct in sh scripts:
if [ "z$x" = z ]; then echo x is empty; fi
Why don't they just write it like this?
if [ "$x" = "" ]; then echo x is empty; fi
TL;DR short answer
In this construct:
if [ "z$x" = z ]; then echo x is empty; fi
the z is a guard against funny content of $x and many other problems.
If you write it without the z:
if [ "$x" = "" ]; then echo x is empty; fi
and $x contains the string -x, you will get:
if [ "-x" = "" ]; then echo x is empty; fi
and that confuses the hell out of some older implementations of [.
If you further omit the quotes around $x, and $x contains the string -f foo -o x, you will get:
if [ -f foo -o x = "" ]; then echo x is empty; fi
and now it silently checks for something completely different.
The guard keeps these mistakes, whether honest human errors or malicious attacks, from slipping through silently. With the guard you get either the correct result or an error message. Read on for an elaborate explanation.
Elaborate explanation
The z in
if [ "z$x" = z ]; then echo x is empty; fi
is called a guard.
To explain why you want the guard I first want to explain the syntax of the bash conditional if. It is important to understand that [ is not part of the syntax. It is a command. It is an alias for the test command. And in most current shells it is a builtin command.
The grammar rule for if is roughly as follows:
if command; then morecommands; else evenmorecommands; fi
(the else part is optional)
command can be any command. Really any command. What bash does when it encounters an if is roughly as follows:
1. Execute command.
2. Check the exit status of command.
3. If the exit status is 0, then execute morecommands. If the exit status is anything else, and the else part exists, then execute evenmorecommands.
Let's try that:
$ if true; then echo yay; else echo boo; fi
yay
$ if wat; then echo yay; else echo boo; fi
bash: wat: command not found
boo
$ if echo foo; then echo yay; else echo boo; fi
foo
yay
$ if cat foo; then echo yay; else echo boo; fi
cat: foo: No such file or directory
boo
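Since if accepts any command, a shell function works just as well. Here is a minimal sketch, assuming a hypothetical helper called is_even (my own name, not something from the examples above):

```shell
# is_even is a hypothetical helper. Its exit status is that of the
# last command it runs, which here is the [ command.
is_even() {
    [ $(( $1 % 2 )) -eq 0 ]
}

# if only looks at the exit status of the command it is given.
if is_even 4; then result_4=even; else result_4=odd; fi
if is_even 7; then result_7=even; else result_7=odd; fi

echo "4 is $result_4"
echo "7 is $result_7"
```

The if never sees the arithmetic. It only sees whether is_even exited with 0 or with something else.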
Let's try the test command:
$ if test z = z; then echo yay; else echo boo; fi
yay
And the alias [:
$ if [ z = z ]; then echo yay; else echo boo; fi
yay
You see, [ is not part of the syntax. It is just a command.
Note that the z here has no special meaning. It is just a string.
Let's try the [ command outside of an if:
$ [ z = z ]
Nothing happens? It returned an exit status. You can check the exit status with echo $?.
$ [ z = z ]
$ echo $?
0
Let's try unequal strings:
$ [ z = x ]
$ echo $?
1
Because [ is a command, it accepts parameters just like any other command. In fact, the closing ] is also a parameter, a mandatory parameter which must come last. If it is missing the command will complain:
$ [ z = z
bash: [: missing `]'
It is misleading that bash appears to do the complaining. Actually the builtin command [ does the complaining. We can see more clearly who does the complaining when we invoke the system [:
$ /usr/bin/[ z = z
/usr/bin/[: missing `]'
Interestingly, the system [ doesn't always insist on a closing ]:
$ /usr/bin/[ --version
[ (GNU coreutils) 7.4
...
You need a space before the closing ], otherwise it will not be recognized as a parameter:
$ [ z = z]
bash: [: missing `]'
You also need a space after the [, otherwise bash will think you want to execute another command:
$ [z = z]
bash: [z: command not found
This is much more obvious when you use test:
$ testz = z
bash: testz: command not found
Remember, [ is just another name for test.
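The three kinds of outcome seen so far (true, false, and a complaint from the command itself) can be told apart by exit status alone. A small sketch, with variable names of my own choosing; the error message is thrown away so that only the status remains:

```shell
# Exit status of [ : 0 means true, 1 means false,
# and a value greater than 1 means the command itself failed.
[ z = z ]
status_true=$?

[ z = x ]
status_false=$?

# Missing closing ] on purpose; the error message is discarded.
[ z = z 2>/dev/null
status_error=$?

echo "true: $status_true, false: $status_false, error: $status_error"
```

An if treats the false case and the error case the same way, because both are simply "not 0".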
[ can do more than just compare strings. It can compare numbers:
$ [ 1 -eq 1 ]
$ [ 42 -gt 0 ]
It can also check for the existence of files or directories:
$ [ -f filename ]
$ [ -d dirname ]
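To see the difference between these kinds of test, here is a sketch that compares the same operands as numbers and as strings, and then checks a scratch file and directory. It assumes mktemp -d is available (it is on most systems, but it is not something the examples above rely on), and the variable names are mine:

```shell
# -eq compares integers, = compares strings.
[ 01 -eq 1 ]; numeric_status=$?   # same number
[ 01 = 1 ];   string_status=$?    # different strings

# Create a scratch directory and file to test against.
tmpdir=$(mktemp -d)
touch "$tmpdir/file"

[ -f "$tmpdir/file" ]; file_is_file=$?
[ -d "$tmpdir/file" ]; file_is_dir=$?
[ -d "$tmpdir" ];      dir_is_dir=$?

rm -r "$tmpdir"

echo "numeric: $numeric_status, string: $string_status"
echo "-f file: $file_is_file, -d file: $file_is_dir, -d dir: $dir_is_dir"
```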
See help [ or man [ for more information about the capabilities of [ (or test). man will show you the documentation for the system command. help will show you the documentation for the bash builtin command.
Now that I have covered the bases I can answer your question:
Why do people write this:
if [ "z$x" = z ]; then echo x is empty; fi
and not this:
if [ "$x" = "" ]; then echo x is empty; fi
For brevity I will strip off the if, because this is only about [.
The z in this construct:
[ "z$x" = z ]
is a guard against funny content of $x in combination with older implementations of [, and/or a guard against human error like forgetting to quote $x.
What happens when $x has funny content like -f?
This
[ "$x" = "" ]
will become
[ "-f" = "" ]
Some older implementations of [ will get confused when the first parameter starts with a -. The z will make sure that the first parameter never starts with a -, regardless of the content of $x.
[ "z$x" = "z" ]
will become
[ "z-f" = "z" ]
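To convince yourself that the guarded form answers correctly whatever $x looks like, you can run it over a handful of operator-like values. This is only a sketch: is_empty is a hypothetical wrapper I made up, and on a modern shell the unguarded quoted form would most likely pass this loop too.

```shell
# is_empty is a hypothetical helper wrapping the guarded comparison.
is_empty() {
    [ "z$1" = z ]
}

empty_count=0
nonempty_count=0
# Values that look like operators to [ , plus one truly empty string.
for value in "-f" "-x" "!" "(" "=" ""; do
    if is_empty "$value"; then
        empty_count=$((empty_count + 1))
    else
        nonempty_count=$((nonempty_count + 1))
    fi
done

echo "empty: $empty_count, non-empty: $nonempty_count"
```

Only the last value is counted as empty, because after the z is prepended none of the others can be mistaken for an operator.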
What happens when you forget to quote $x? Funny content like -f foo -o x can change the entire meaning of the test.
[ $x = "" ]
will become
[ -f foo -o x = "" ]
The test now checks whether the file foo exists, and ORs that with whether the literal string x equals the empty string. The worst part is that you won't even notice, because there is no error message, only an exit status. If $x comes from user input this can even be used for malicious attacks.
With the guarding z
[ z$x = z ]
will become
[ z-f foo -o x = z ]
At least you will now get an error message:
$ [ z-f foo -o x = z ]; echo $?
bash: [: too many arguments
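Here is the same comparison as a script you can run. It is a sketch with one deliberate change: I use -d / instead of -f foo, because / is always a directory, so the injected test comes out true and the wrong answer is visible. The statuses are what I would expect from bash; how [ handles this many arguments is not guaranteed to be the same in every shell.

```shell
# Non-empty content chosen so the injected test is true:
# / is always a directory. (A variation on the -f foo example.)
x='-d / -o x'

# Unquoted, no guard: expands to [ -d / -o x = "" ].
# Deliberately unquoted to show the problem.
[ $x = "" ] 2>/dev/null
unguarded_status=$?

# Unquoted, with guard: expands to [ z-d / -o x = z ].
[ z$x = z ] 2>/dev/null
guarded_status=$?

# Quoted, with guard: a plain string comparison.
[ "z$x" = z ]
quoted_status=$?

echo "unguarded: $unguarded_status"
echo "guarded:   $guarded_status"
echo "quoted:    $quoted_status"
```

The unguarded test claims that a non-empty $x is empty (status 0) without any complaint. The guarded but unquoted test fails with an error status, and the guarded and quoted test gives the correct answer.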
The guard also helps in the case of an undefined variable instead of the empty string. Some older shells behaved differently for an undefined variable and an empty string. This problem is basically solved, because in modern shells an undefined variable mostly behaves like an empty string.
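A quick way to check how your own shell treats the undefined case is sketched below. It assumes the script is not running under set -u, where expanding an unset variable is an error. The ${x+set} expansion is standard shell syntax that tells the two cases apart if you ever need to; the variable names are mine.

```shell
unset x

# An unset variable expands to the empty string, so the guarded
# test treats it exactly like an empty one.
if [ "z$x" = z ]; then unset_looks_empty=yes; else unset_looks_empty=no; fi

# ${x+word} expands to word only when x is set, even if set to "".
state_unset=${x+set}
x=""
state_empty=${x+set}

echo "unset looks empty: $unset_looks_empty"
echo "\${x+set} when unset: '$state_unset', when empty: '$state_empty'"
```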
Summary:
The quotes around $x help to make the undefined cases behave more like the empty string cases.
The guard before $x helps to further prevent all the other problems mentioned above.
The guard before $x will prevent all these possible errors:
- funny content of $x (code injection by a malicious user)
- old implementations of [ (getting confused if the string begins with -)
- forgetting to quote $x (will allow -f foo -o x to subvert the meaning of the test)
- undefined $x (older implementations behave differently if undefined)
The guard will either do the right thing or raise an error message.
Modern implementations of [ have fixed some of the problems and modern shells have some solutions for the other cases, but they have pitfalls of their own. The guarding z is not necessary if you are otherwise careful, but it makes avoiding mistakes while writing simple tests so much simpler.
For testing zero length, use -z:
if [ -z "$x" ] ; then
echo x is empty
fi
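The quotes matter with -z and -n as well. A sketch of what happens when they are left out and $x is empty (the variable names are mine): the empty expansion disappears entirely, and [ is left with a single argument, which it treats as a plain non-empty string.

```shell
x=""

# Quoted: [ receives -n and an empty string, so the test is false.
[ -n "$x" ]; quoted_n=$?

# Unquoted: the empty expansion vanishes and [ receives only -n,
# which it treats as a non-empty string, so the test is true.
[ -n $x ]; unquoted_n=$?

# Unquoted -z happens to give the right answer here for the same
# reason: [ -z ] is a one-argument test of the string -z.
[ -z $x ]; unquoted_z=$?

echo "-n quoted: $quoted_n, -n unquoted: $unquoted_n, -z unquoted: $unquoted_z"
```

So an unquoted -n test reports an empty variable as non-empty, and an unquoted -z test is only right by accident.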
With bash, you can use its [[, which does not need quotes:
if [[ -z $x ]] ; then
echo x is empty
fi
I just found the following in man 1p sh, the documentation of the POSIX shell:
Historical systems have also been unreliable given the common construct:
test "$response" = "expected string"
One of the following is a more reliable form:
test "X$response" = "Xexpected string"
test "expected string" = "$response"
Note that the second form assumes that expected string could not be confused with any unary primary. If expected string starts with '-', '(', '!', or even '=', the first form should be used instead.