I know the difference in purpose between parentheses () and curly braces {} when grouping commands in bash.
But why does the curly brace construct require a semicolon after the last command, whereas for the parentheses construct, the semicolon is optional?
$ while false; do ( echo "Hello"; echo "Goodbye"; ); done
$ while false; do ( echo "Hello"; echo "Goodbye" ); done
$ while false; do { echo "Hello"; echo "Goodbye"; }; done
$ while false; do { echo "Hello"; echo "Goodbye" }; done
bash: syntax error near unexpected token `done'
$
I'm looking for some insight as to why this is the case. I'm not looking for answers such as "because the documentation says so" or "because it was designed that way". I'd like to know why it was designed this way. Or is it just a historical artifact?
This may be observed in at least the following versions of bash:
Semicolons go at the end of lines that do not end in a curly brace or to separate statements on the same line. It does no harm to use them after a closing brace, or to wear suspenders and a belt, but it does look a little nerdy.
The double semicolon (;;) is also useful because it leaves no ambiguity in the code. Bash syntax requires it at the end of each clause so that the command can be parsed correctly (it may be omitted only after the last clause). It is used only in case constructs, to mark the end of an alternative.
Actually bash treats ; as a line break. So you can either use a semicolon between natural1 and do, or put do on the next line. Both are valid syntax. A lot of programmers use the semicolon because they find it neater to keep the loop declaration on a single line.
The curly braces tell the shell interpreter where the end of the variable name is.
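As a small illustration of that point, here is a sketch using a made-up variable name, fruit. Without braces the shell reads the longest possible name, so $fruits refers to a different (unset) variable.

```shell
# "fruit" is a hypothetical variable used only for this illustration.
# $fruits   -> looks up a variable called "fruits", which is unset here
# ${fruit}s -> expands "fruit" and then appends a literal "s"
demo=$(bash -c 'fruit=apple; echo "[$fruits] [${fruit}s]"')
echo "$demo"   # prints: [] [apples]
```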
Because { and } are only recognized as special syntax if they are the first word in a command.
There are two important points here, both of which are found in the definitions section of the bash manual. First, is the list of metacharacters:
metacharacter
A character that, when unquoted, separates words. A metacharacter is a blank or one of the following characters: ‘|’, ‘&’, ‘;’, ‘(’, ‘)’, ‘<’, or ‘>’.
That list includes parentheses but not braces (neither curly nor square). Note that it is not a complete list of characters with special meaning to the shell, but it is a complete list of characters which separate tokens. So { and } do not separate tokens, and will only be considered tokens themselves if they are adjacent to a metacharacter, such as a space or a semi-colon.
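A quick way to see the difference is to leave out the space after the opening character. This is only a sketch; it runs each snippet in a child bash so that the failure does not affect the current shell.

```shell
# ( is a metacharacter, so "(echo" is split into "(" and "echo".
paren_out=$(bash -c '(echo hello)')

# { is not a metacharacter, so "{echo" stays one word and bash looks for
# a command literally named "{echo".
brace_status=0
brace_out=$(bash -c '{echo hello' 2>&1) || brace_status=$?

# With a space after it, { is a word on its own and is recognized.
spaced_out=$(bash -c '{ echo hello; }')

echo "parentheses: $paren_out"
echo "brace without space: exit status $brace_status"
echo "brace with space: $spaced_out"
```

The second command fails with "command not found" (exit status 127), not with a syntax error, which shows that bash treated {echo as an ordinary command name.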
Although braces are not metacharacters, they are treated specially by the shell in parameter expansion (e.g. ${foo}) and brace expansion (e.g. foo.{c,h}). Other than that, they are just normal characters. There is no problem with naming a file {ab}, for example, or }{, since those words do not conform to the syntax of either parameter expansion (which requires a $ before the {) or brace expansion (which requires at least one comma, or a .. range, between { and }). For that matter, you could use { or } as a filename without ever having to quote the symbols. Similarly, you can call a file if, done or time without having to think about quoting the name.
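The file names mentioned above can be tried out directly. The following is a sketch that works in a throwaway directory created with mktemp; nothing on the touch command line is quoted.

```shell
dir=$(mktemp -d)
(
  cd "$dir" || exit 1
  # None of these words is the first word of the command, and none matches
  # the syntax of parameter or brace expansion, so each is a plain argument.
  touch {ab} }{ { } if done time
)
file_count=$(ls "$dir" | wc -l | tr -d ' ')
ls "$dir"
echo "created $file_count files"   # prints: created 7 files
rm -rf "$dir"
```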
These latter tokens are "reserved words":
reserved word
A word that has a special meaning to the shell. Most reserved words introduce shell flow control constructs, such as for and while.
The bash manual doesn't contain a complete list of reserved words, which is unfortunate, but they certainly include those designated by POSIX:
! { }
case do done elif else
esac fi for if in
then until while
as well as the extensions implemented by bash (and some other shells):
[[ ]]
function select time
These words are not the same as built-ins (such as [), because they are actually part of the shell syntax. The built-ins could be implemented as functions or shell scripts, but reserved words cannot because they change the way that the shell parses the command line.
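Bash itself will report which category a word falls into. The sketch below uses the type builtin with -t; the helper name kind_of is made up for this example.

```shell
# kind_of is a hypothetical helper: it asks a child bash how it classifies
# a word ("keyword" means reserved word, "builtin" means built-in command).
kind_of() { bash -c 'type -t "$1"' _ "$1"; }

for word in '{' '}' if '[[' time '[' echo; do
  printf '%s\t%s\n' "$word" "$(kind_of "$word")"
done
```

The first five words are reported as keyword, while [ and echo are reported as builtin.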
There is one very important feature of reserved words, which is not actually highlighted in the bash manual but is made very explicit in POSIX (from which the above lists of reserved words were taken, except for time):
This recognition [as a reserved word] shall only occur when none of the characters is quoted and when the word is used as:
- The first word of a command …
(The full list of places where reserved words are recognized is slightly longer, but the above is a pretty good summary.) In other words, reserved words are only reserved when they are the first word of a command. And, since { and } are reserved words, they are only special syntax if they are the first word in a command.
Example:
ls } # } is not a reserved word. It is an argument to `ls`
ls;} # } is a reserved word; `ls` has no arguments
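The same rule explains the commands in the question. Here is a sketch that feeds them to a child bash and reports whether each one was accepted; the helper name check is invented for this example.

```shell
# check is a hypothetical helper: it runs a command line in a child bash
# and reports whether bash accepted it.
check() {
  if bash -c "$1" >/dev/null 2>&1; then echo "accepted"; else echo "rejected"; fi
}

# ) is a metacharacter, so it ends the word "Goodbye" and closes the group.
check 'while false; do ( echo "Hello"; echo "Goodbye" ); done'

# After the semicolon, } is the first word of a command: a reserved word.
check 'while false; do { echo "Hello"; echo "Goodbye"; }; done'

# Without the semicolon, } is just another argument to echo, so the group
# is still open when bash reaches done.
check 'while false; do { echo "Hello"; echo "Goodbye" }; done'

# As an argument, } is printed like any other word.
bash -c 'echo }'
```

The three check calls report accepted, accepted and rejected, and the final command prints a single }.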
There is lots more I could write about shell parsing, and bash parsing in particular, but it would rapidly get tedious. (For example, the rule about when # starts a comment and when it is just an ordinary character.) The approximate summary is: "don't try this at home"; really, the only thing which can parse shell commands is a shell. And don't try to make sense of it: it's just a random collection of arbitrary choices and historical anomalies, many but not all based on the need to not break ancient shell scripts with new features.