
Why are empty arrays treated as unset in bash?

Recently, I set up Microsoft's Windows Subsystem for Linux (WSL) on my computer. It provides a Linux environment on Windows; basically, it's Cygwin, but a little better connected to the underlying Windows system. After switching from Cygwin to WSL, however, I ran into a problem. I don't know whether it's particular to WSL, but it doesn't happen in Cygwin.

To catch bugs in my code a little faster, I've taken to using bash's set -u option, which causes the shell to "treat unset variables as an error when substituting." Without this, bash treats unset variables as variables set to the empty string when expanding them.

However, this has an odd unintended consequence (at least on WSL) with respect to arrays:

Me@Computer:~$ set -u
==>
Me@Computer:~$ declare -p array
==> bash: declare: array: not found
Me@Computer:~$ array=( )
==>
Me@Computer:~$ declare -p array
==> declare -a array='()'
Me@Computer:~$ echo "${array[@]}"       # Expands to "echo" (with 0 args), right?
==> bash: array[@]: unbound variable    # Wrong! wtf, bash??

As you can see from the output of declare -p array, bash does recognize the difference between array being empty and array being unset—until it comes time to actually expand it, whereupon bash throws a fit. I know bash treats the @ and * variables specially, and even more so when quoted, so I tried a bunch of stuff. Nothing works:

Me@Computer:~$ echo "${array[@]}"
==> bash: array[@]: unbound variable
Me@Computer:~$ echo "${array[*]}"
==> bash: array[*]: unbound variable
Me@Computer:~$ echo ${array[@]}
==> bash: array[@]: unbound variable
Me@Computer:~$ echo ${array[*]}
==> bash: array[*]: unbound variable

Oddly enough, I can expand the list of the array's indices; however, bash then has the opposite problem, in that it also succeeds when asked for the indices of an unset array:

Me@Computer:~$ echo "${!array[@]}"
==>
Me@Computer:~$ echo "${!unset_array[@]}"
==>

(The above works for all variations of the array expansion formats.)
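Since the index expansion cannot tell the two cases apart, one way to distinguish an empty array from an unset one under set -u is to lean on declare -p, as in the first transcript. This is only a rough sketch; is_declared is a hypothetical helper name, not something bash provides:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: succeeds if a variable with the given name has been
# declared, even when it is an array with no elements.
is_declared() { declare -p "$1" >/dev/null 2>&1; }

array=()
unset -v unset_array        # make sure this name really is unset

if is_declared array; then empty_result=declared; else empty_result=unset; fi
if is_declared unset_array; then unset_result=declared; else unset_result=unset; fi

echo "array: $empty_result, unset_array: $unset_result"
```

Because the check never expands the array itself, it does not trip nounset on either case.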

Most frustratingly, I can't even access the length of an empty array:

Me@Computer:~$ echo "${#array[@]}"
==> bash: array[@]: unbound variable

This too fails with all of the variations of the format.

Does anyone know why this is happening? Is it a bug, or is this expected behavior? If it's the latter, what's the motivation? Are there any ways to disable this behavior that allow me to keep set -u?


Workaround(s):

I hit upon a really crappy workaround taking advantage of the fact that the positional parameters are immune to this phenomenon. If anyone finds a better one, please let me know!

Me@Computer:~$ tmp=( "$@" )                    # Stash the real positional params; we need that array
Me@Computer:~$ set --                          # "$@" is now empty.
Me@Computer:~$ example_cmd "${array[@]-$@}"    # Now expands w/out error *and* w/ the right number of args
Me@Computer:~$ set -- "${tmp[@]-$@}"           # Put the positional params back where we found them
Me@Computer:~$ unset tmp                       # Cleaning up after ourselves

(Note that you still need to use trickery when resetting the positional parameters, just in case they themselves were originally empty.) These contortions would need to be performed every time a potentially empty array was used.


Other notes:

  • test -v also thinks empty arrays are unset, unlike declare -p.
  • The same problems occur with associative arrays.
  • I tried initializing the array with declare (i.e., declare -a array=( )), but that changed nothing.
  • The positional parameter arrays, thankfully, seem to be immune from this phenomenon.
  • I thought of just using "${array[@]-}" whenever I wanted to access an array, but this won't work in all scenarios. "${array[@]}", when double quoted, is supposed to expand as separate words for each array element; an empty array, then, should be expanded into 0 words (compare set -- "$@";echo $# with set -- "$*";echo $#). "${array[@]-}", however, expands into a single word, the empty string.
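The word-count difference described in the last bullet can be checked directly. A small sketch, assuming a hypothetical helper count_args that just prints how many arguments it received:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

set --                                  # clear the positional parameters

at_count=$(count_args "$@")             # "$@" with no parameters: zero words
star_count=$(count_args "$*")           # "$*" with no parameters: one empty word

array=()
# The "-" default kicks in for the empty array and yields one empty word,
# which is exactly what makes it unsuitable as a drop-in replacement.
default_count=$(count_args "${array[@]-}")

echo "\"\$@\": $at_count  \"\$*\": $star_count  \"\${array[@]-}\": $default_count"
```

A command that receives one empty-string argument can behave very differently from one that receives no arguments at all, so the two expansions are not interchangeable.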

Version & environment info:

Like I said at the top, I'm using the Windows Subsystem for Linux on Windows 10. Other info:

Me@Computer:~$ bash --version
==> GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)
    ...
Me@Computer:~$ echo "$-"
==> himuBCH
asked Jan 23 '18 04:01 by greatBigDot



1 Answer

This isn't specific to WSL; it depends on the Bash version.

The behaviour was reported as a bug against Bash 4.1, but was considered intended behaviour. Chet Ramey, the Bash maintainer, also pointed out that $@ and $* behave differently because POSIX mandates it. The recommended workaround back then, similar to Andy's comment, was:

echo ${argv[0]+"${argv[@]}"}

which expands to "${argv[@]}" if argv[0] is set, and to nothing otherwise (note that the outer expansion is unquoted).
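A minimal sketch of that idiom under set -u follows. Two things here are assumptions rather than part of the original recommendation: count_args is a hypothetical helper that reports how many words it received, and the test uses the [@] subscript instead of [0] so that a sparse array with no element at index 0 is still expanded:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

empty=()
filled=("a b" "c")

# The outer expansion is unquoted and the inner one is quoted, so elements
# containing spaces stay intact and an empty array contributes no words.
empty_count=$(count_args ${empty[@]+"${empty[@]}"})
filled_count=$(count_args ${filled[@]+"${filled[@]}"})

echo "empty: $empty_count, filled: $filled_count"
```

The same spelling keeps working on newer Bash versions, so it is a reasonable choice for scripts that have to run on both sides of the 4.4 change.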

In Bash 4.4, the behaviour changed, as documented in CHANGES, from bash-4.4-beta2 to bash-4.4-rc2, as a "new feature":

Using ${a[@]} or ${a[*]} with an array without any assigned elements when the nounset option is enabled no longer throws an unbound variable error.
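If a script has to know which behaviour it will get, one option is to check the version at run time. This is only a sketch: empty_array_expansion_is_safe is a hypothetical function name, and the 4.4 threshold is taken from the CHANGES entry quoted above:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: succeeds when the running Bash is 4.4 or newer,
# i.e. a version where expanding an empty array under nounset is allowed.
empty_array_expansion_is_safe() {
    (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 4) ))
}

if empty_array_expansion_is_safe; then
    printf 'Bash %s: a plain "${a[@]}" is fine under set -u\n' "$BASH_VERSION"
else
    printf 'Bash %s: guard it, e.g. ${a[@]+"${a[@]}"}\n' "$BASH_VERSION"
fi
```

In practice it is usually simpler to use the guarded form everywhere than to branch on the version; the check is mostly useful for diagnostics.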

answered Sep 28 '22 10:09 by Benjamin W.
