
Get length of an empty or unset array when “nounset” option is in effect

Tags: bash

When running with set -o nounset (also known as set -u), Bash may treat an empty array as unset, regardless of whether it has actually been assigned an empty value. Care must therefore be taken when expanding an array, and one workaround is to check first whether the array length is zero. Besides, getting the number of elements in an array is a common operation in its own right.

While developing with Bash 4.2.47(1)-release on openSUSE 42.1, I got used to the fact that getting the array size with ${#ARRAY_NAME[@]} succeeds whether the array is empty or unset. However, when I checked my script with Bash 4.3.46(1)-release on FreeBSD 10.3, it turned out that this operation may fail with a generic “unbound variable” error message. Providing a default value for the expansion does not seem to work for array length. Providing alternative command chains seems to work, but not inside a function called through a subshell expansion such as $(Size 'MYARRAY'): the function just exits after the first failure. What else could help here?

Consider the following example:

function Size ()
{
    declare VAR="$1"
    declare REF="\${#${VAR}[@]}"
    eval "echo \"${REF}\" || echo 0" 2>/dev/null || echo 0
}

set -u
declare -a MYARRAY

echo "size: ${#MYARRAY[@]}"
echo "size: ${#MYARRAY[@]-0}"
echo "Size: $(Size 'MYARRAY')"
echo -n "Size: "; Size 'MYARRAY'

In the openSUSE environment, all echo lines output 0, as expected. On FreeBSD, the same outcome is only possible when the array is explicitly assigned an empty value (MYARRAY=()). Otherwise, the inline queries in the first two lines fail, the third line outputs just Size: (meaning that the expansion result is empty), and only the last line succeeds completely thanks to the outer || echo 0. However, printing the result straight to the screen is not what is usually intended when trying to obtain an array length.

Here is the summary of my observations:

                                    Bash 4.2  Bash 4.3
                                    openSUSE  FreeBSD

counting elements of unset array       OK      FAILED
counting elements of empty array       OK        OK

content expansion of unset array     FAILED    FAILED
content expansion of unset array(*)    OK        OK
content expansion of empty array     FAILED    FAILED
content expansion of empty array(*)    OK        OK
    (* with fallback value supplied)
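
The observations in the table can be re-checked on any installed Bash with a small probe script. The following is only a sketch: the helper name probe and the array name A are made up for illustration, and each case runs in a subshell because a failed expansion under nounset aborts a non-interactive shell. The results depend on the Bash version, so no expected output is given.

```bash
#!/usr/bin/env bash
# Re-check the table rows on the Bash that runs this script.
# "probe" and "A" are hypothetical names used only for this sketch.

probe () {
    # $1 = label, $2 = code to evaluate with nounset enabled.
    # The subshell keeps a failed expansion from killing the whole script.
    if ( set -u; eval "$2" ) >/dev/null 2>&1; then
        printf '%-40s OK\n' "$1"
    else
        printf '%-40s FAILED\n' "$1"
    fi
}

echo "Bash ${BASH_VERSION}"
probe 'count, unset array'             'declare -a A;    : "${#A[@]}"'
probe 'count, empty array'             'declare -a A=(); : "${#A[@]}"'
probe 'expand, unset array'            'declare -a A;    : "${A[@]}"'
probe 'expand, unset array, fallback'  'declare -a A;    : "${A[@]-}"'
probe 'expand, empty array'            'declare -a A=(); : "${A[@]}"'
probe 'expand, empty array, fallback'  'declare -a A=(); : "${A[@]-}"'
```

Running the same file under each interpreter of interest gives one column of the table per run.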

To me, that looks pretty inconsistent. Is there a truly future-proof and cross-platform solution for this?

Anton Samsonov, asked Oct 30 '22 00:10


2 Answers

There are known (documented) differences between the Linux and BSD flavors of bash. I would suggest writing your code as per the POSIX standard. The specification at www2.opengroup.org is a good place to start.

With that in mind, you can start bash with the --posix command-line option or you can execute the command set -o posix while bash is running. Either will cause bash to conform to the POSIX standard.
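
As a rough illustration of the second variant, the sketch below switches POSIX mode on at run time and checks that it is active. The array name ITEMS is invented for the example. Arrays are a Bash extension rather than part of POSIX sh, and Bash keeps them available in POSIX mode, so the length query is still written the same way.

```bash
#!/usr/bin/env bash
# Enable POSIX mode at run time and confirm that it took effect.
set -o posix

if shopt -qo posix; then
    echo "posix mode: on"
fi

# ITEMS is a hypothetical array; arrays remain usable in POSIX mode.
declare -a ITEMS=(a b c)
echo "items: ${#ITEMS[@]}"
```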

The above suggestion will increase the probability of cross-platform consistency.

tale852150, answered Nov 15 '22 10:11


As a temporary solution, I followed the route suggested by @william-pursell and simply disabled the nounset option for the duration of the query:

function GetArrayLength ()
{
    declare ARRAY_NAME="$1"
    declare INDIRECT_REFERENCE="\${#${ARRAY_NAME}[@]}"
    case "$-" in
    *'u'*)
        set +u
        eval "echo \"${INDIRECT_REFERENCE}\""
        set -u
        ;;
    *)
        eval "echo \"${INDIRECT_REFERENCE}\""
        ;;
    esac
}

(Using if instead of case leads to negligibly slower execution on my test machines. Moreover, case makes it easy to match additional options if that ever becomes necessary.)
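
For reference, a self-contained usage sketch of this approach follows. The function is a condensed copy of the one above, and the three array names are made up for the demonstration. Because the name is passed through eval, it should only ever come from trusted code, never from user input.

```bash
#!/usr/bin/env bash
set -u

GetArrayLength () {
    declare ARRAY_NAME="$1"
    declare INDIRECT_REFERENCE="\${#${ARRAY_NAME}[@]}"
    case "$-" in
    *u*)
        # nounset is active: switch it off for the query, then restore it.
        set +u
        eval "echo \"${INDIRECT_REFERENCE}\""
        set -u
        ;;
    *)
        eval "echo \"${INDIRECT_REFERENCE}\""
        ;;
    esac
}

# Hypothetical test arrays: declared only, assigned empty, and populated.
declare -a UNSET_ARRAY
declare -a EMPTY_ARRAY=()
declare -a FULL_ARRAY=(one two three)

echo "unset: $(GetArrayLength UNSET_ARRAY)"
echo "empty: $(GetArrayLength EMPTY_ARRAY)"
echo "full:  $(GetArrayLength FULL_ARRAY)"

# A direct call (no subshell) must leave nounset enabled for the caller.
GetArrayLength FULL_ARRAY >/dev/null
case "$-" in *u*) echo "nounset still on" ;; esac
```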

I also tried exploiting the fact that content expansion (with fallback or replacement value) usually succeeds even for unset arrays:

function GetArrayLength ()
{
    declare ARRAY_NAME="$1"
    declare INDIRECT_REFERENCE="${ARRAY_NAME}[@]"
    if [[ -z "${!INDIRECT_REFERENCE+isset}" ]]; then
        echo 0
    else
        INDIRECT_REFERENCE="\${#${ARRAY_NAME}[@]}"
        eval "echo \"${INDIRECT_REFERENCE}\""
    fi
}

However, it turns out that Bash does not optimize the ${a[@]+b} expansion: execution time clearly increases for larger arrays, although this variant is the fastest one for empty or unset arrays.
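
The test that this second variant relies on can be tried in isolation. In the sketch below (array names and the loop variable are illustrative only), the indirect expansion with the + operator produces the replacement word only when the array has at least one element, and it does not trip nounset for the unset or empty cases.

```bash
#!/usr/bin/env bash
set -u

# Hypothetical arrays: declared only, assigned empty, and populated.
declare -a UNSET_ARRAY
declare -a EMPTY_ARRAY=()
declare -a FULL_ARRAY=(one two three)

for NAME in UNSET_ARRAY EMPTY_ARRAY FULL_ARRAY; do
    REF="${NAME}[@]"
    # "${!REF+isset}" is empty unless the array holds at least one element.
    if [[ -z "${!REF+isset}" ]]; then
        echo "${NAME}: no elements"
    else
        echo "${NAME}: has elements"
    fi
done
```

Only FULL_ARRAY should be reported as having elements, which is why the function can return 0 early for the other two cases.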

Nevertheless, if anyone has a better solution, feel free to post other answers.

Anton Samsonov, answered Nov 15 '22 11:11