This code (it’s a part of a shell function) works perfectly:
    output=$(\
            cat "${vim_file}" | \
            sed -rne "${EXTRACT_ENTITIES}" | \
            sed -re "${CLEAR_LEADING_QUOTES}" | \
            sed -re "${NORMALIZE_NAMES}" \
    )
But when I insert the word “local” before the assignment…
    local output=$(\
            cat "${vim_file}" | \
            sed -rne "${EXTRACT_ENTITIES}" | \
            sed -re "${CLEAR_LEADING_QUOTES}" | \
            sed -re "${NORMALIZE_NAMES}" \
    )
…I get a strange error:
local: commands.: bad variable name
There are no stray invisible characters in the code: only tabs for indentation and spaces everywhere else. The script begins with “#!/bin/sh”. Adding “local” before other variables in the function causes no problems. Replacing “output” (the name of the variable) with any other name changes nothing. The OS is Linux.
Really short answer: Use more quotes!
local output="$(\
        cat "${vim_file}" | \
        sed -rne "${EXTRACT_ENTITIES}" | \
        sed -re "${CLEAR_LEADING_QUOTES}" | \
        sed -re "${NORMALIZE_NAMES}" \
)"
Longer answer: It's almost always a good idea to double-quote variable references and command substitutions. Double-quoting prevents them from being subject to word splitting and filename wildcard expansion, which is rarely something you want, and can cause confusing problems.
There are situations where it's safe to leave the double-quotes off, but the rules are confusing and hard to remember, and easy to get wrong. This is one of those confusing cases. One of the situations where word splitting and wildcard expansion don't happen (and therefore it's safe to leave the double-quotes off) is on the right-hand side of an assignment:
var=$othervar           # safe to omit double-quotes
var2=$(somecommand)     # also safe
var="$othervar"          # this also works fine
var2="$(somecommand)"    # so does this
Some shells extend this to assignments that are part of a command, like local or export:
export var=$othervar         # *Maybe* ok, depending on the shell
local var2=$(somecommand)    # also *maybe* ok
bash treats these as a type of assignment, so it doesn't do the split-expand thing with the values. But dash, which is what /bin/sh points to on Debian and Ubuntu, treats this more like a regular command (where the arguments do get split-expanded), so a script that starts with #!/bin/sh may well be running under dash and can have problems like this.
For example, suppose somecommand prints "export and local are shell commands." Then in dash, local var2=$(somecommand) would expand to:
local var2=export and local are shell commands.
...which would declare the local variable var2 (which gets set to "export"), plus four more local variables named "and", "local", "are", and "shell". It would also try to declare "commands." as a local variable, but fail because a name containing a period is not a legal variable name.
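The splitting can be reproduced in any POSIX-style shell without depending on how local itself behaves, because set is always treated as an ordinary command. This is a minimal sketch, assuming a hypothetical somecommand that prints the sentence from the example; show_words and the three result variables are names made up for this illustration.

```shell
# Hypothetical stand-in for the command from the example above.
somecommand() { echo "export and local are shell commands."; }

show_words() {
    # Deliberately unquoted: with the default IFS the substitution is split
    # into separate words, just like the arguments of any ordinary command.
    set -- var2=$(somecommand)
    word_count=$#
    first_word=$1
    # Walk through all arguments; the loop variable ends up holding the last one.
    for last_word in "$@"; do :; done
}

show_words
printf '%s\n' "$word_count" "$first_word" "$last_word"
# prints:
# 6
# var2=export
# commands.
```

The last word, "commands.", is exactly the kind of argument that a shell treating local as a regular command would reject as a bad variable name.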
Therefore, use more quotes!
export var="$othervar"         # Safe in all shells
local var2="$(somecommand)"    # also safe
Or separate the declarations (or both!):
export var
var=$othervar         # Safe in all shells, with or without quotes
local var2
var2=$(somecommand)    # also safe, with or without quotes
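Both fixes can be checked side by side. The following is a small sketch, again assuming a hypothetical somecommand; the function names quoted and split_decl are invented for the demonstration. It relies on local, which is not part of POSIX but is available in bash and dash.

```shell
# Hypothetical stand-in for a command whose output contains spaces.
somecommand() { echo "export and local are shell commands."; }

quoted() {
    # Fix 1: double-quote the command substitution.
    local var2="$(somecommand)"
    printf '%s\n' "$var2"
}

split_decl() {
    # Fix 2: declare first, then assign with a plain assignment.
    local var2
    var2=$(somecommand)
    printf '%s\n' "$var2"
}

quoted_result=$(quoted)
split_result=$(split_decl)
printf '%s\n' "$quoted_result" "$split_result"
```

Both functions should print the full sentence, whichever shell runs the script.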
The answer was found here: Advanced Bash-Scripting Guide. Chapter 24. Functions
This is a quotation from there:
As Evgeniy Ivanov points out, when declaring and setting a local variable in a single command, apparently the order of operations is to first set the variable, and only afterwards restrict it to local scope.
This means that if the value assigned to a local variable contains a space, the shell uses only the first word for the assignment when it executes the local command. The rest of the string is interpreted according to its content.
How the shell interprets the rest of the content is still a puzzle to me. In my case it tried to perform assignments using arbitrary parts of the files being read. For example, the “commands.” string in the error message was the end of a sentence in one of the files the cat command operated on.
So, there are two ways to solve the problem.
The first one is to split the assignment. I.e. instead of…
local output=$(cat ...
…it must be:
local output
output=$(cat ...
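Splitting the assignment has a side benefit that relates to the quotation above: local itself returns success, so in the combined form the exit status of the command substitution is lost, while a plain assignment passes it through. A minimal sketch, assuming a hypothetical failing command named fails; the other names are also made up for the illustration:

```shell
# Hypothetical command that prints something and then fails with status 3.
fails() { echo "partial output"; return 3; }

combined() {
    local out="$(fails)"
    combined_status=$?      # status of local itself, not of fails
}

separate() {
    local out
    separate_status=0
    # A plain assignment reports the status of the command substitution.
    out=$(fails) || separate_status=$?
}

combined
separate
echo "combined: $combined_status"
echo "separate: $separate_status"
# prints:
# combined: 0
# separate: 3
```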
The second approach, taken from the comments under the question, is to put quotes around the entire expression:
local output="$(cat...)"
In summary: when writing shell scripts, always keep in mind the insidious splitting at spaces.
P.S. Read the brilliant explanation from Gordon Davisson.