In bash scripting, what is the best way to convert a string, containing literal quotes surrounding multiple words, into an array with the same result of parsed arguments?
Many existing questions sidestep this problem rather than solve it. This question lays out the requirements below and asks the reader to focus on them and, if you are up for it, to take on the challenge of finding the optimum solution.
An existing script that is currently in use is being converted to receive its parameters via a named pipe or a similar stream. To minimize the impact on the myriad of scripts outside of the developer's control, a decision was made not to change the interface: existing scripts must be able to pass the same arguments via the new stream implementation as they did before.
$ ./string2array arg1 arg2 arg3
args=(
[0]="arg1"
[1]="arg2"
[2]="arg3"
)
$ echo "arg1 arg2 arg3" | ./string2array
args=(
[0]="arg1"
[1]="arg2"
[2]="arg3"
)
As pointed out in "Bash and Double-Quotes passing to argv", literal quotes are not parsed as one would expect.
This workbench script can be used to test various solutions; it handles the transport and formulates a measurable response. It is suggested that you focus on the solution script, which gets sourced with the string as its argument and should populate the $args variable as an array.
#!/usr/bin/env bash
#string2array
args=()
function inspect() {
local inspct=$(declare -p args)
inspct=${inspct//\[/\\n\\t[}; inspct=${inspct//\'/}; inspct="${inspct:0:-1}\n)"
echo -e ${inspct#*-a }
}
while read -r; do
# source the solution to turn $REPLY into the $args array
source "$1" "${REPLY}"
inspect
done
The usual solution for turning a string into a space-delimited array of words works for our first example above:
#solution1
args=($@)
Unfortunately, the standard solution produces an undesired result for quoted multi-word arguments:
$ echo 'arg1 "multi arg 2" arg3' | ./string2array solution1
args=(
[0]="arg1"
[1]="\"multi"
[2]="arg"
[3]="2\""
[4]="arg3"
)
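The cause is the order of shell expansions: quote removal only applies to quotes written literally in the script, while quote characters that come out of a parameter expansion are ordinary data by the time word splitting happens. A minimal sketch of the effect, where the variable name `line` is only an illustration and stands in for the string read from the stream:

```bash
#!/usr/bin/env bash
# Illustrative only: `line` stands in for the string read from the stream.
line='arg1 "multi arg 2" arg3'

set -f          # disable globbing so a word such as * is not expanded
args=($line)    # unquoted expansion: split on $IFS, quotes stay in the data
set +f

echo "${#args[@]}"              # number of words after splitting
printf '<%s>\n' "${args[@]}"    # each word, quote characters included
```

This prints a count of 5, and the second and fourth elements still carry their literal double quotes.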
Using the workbench script, provide a solution snippet that produces the following result for the arguments received.
$ echo 'arg1 "multi arg 2" arg3' | ./string2array solution-xyz
args=(
[0]="arg1"
[1]="multi arg 2"
[2]="arg3"
)
The solution should be compatible with standard argument parsing in every way. The following unit test should pass for the provided solution. If you can think of anything currently missing from the unit test, please leave a comment and we can update it.
Update: The test has been simplified and now includes the Jonathan Leffler test.
#!/usr/bin/env bash
#test_string2array
solution=$1
function test() {
cmd="echo \"${1}\" | ./string2array $solution"
echo "$ ${cmd}"
echo ${1} | ./string2array $solution > /tmp/t
cat /tmp/t
echo -n "Result : "
[[ $(cat /tmp/t|wc -l) -eq 7 ]] && echo "PASSED!" || echo "FAILED!"
}
echo 1. Testing single args
test 'arg1 arg2 arg3 arg4 arg5'
echo
echo 2. Testing multi args \" quoted
test 'arg1 "multi arg 2" arg3 "a r g 4" arg5'
echo
echo 3 Testing multi args \' quoted
test "arg1 'multi arg 2' arg3 'a r g 4' arg5"
echo
echo 4 Jonathan Leffler test
test "He said, \"Don't do that!\" but \"they didn't listen.\""
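For reference, test 4 covers apostrophes sitting inside double-quoted phrases. The shell itself parses that line into five arguments, which is why every test expects seven lines of workbench output (`args=(`, five elements, and `)`). A minimal check, with `count_args` as a hypothetical helper that is not part of the workbench:

```bash
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments the shell actually passed.
count_args() { echo "$#"; }

# The same words as test 4, this time parsed by the shell itself.
count_args He said, "Don't do that!" but "they didn't listen."
```

The count printed is 5: `He`, `said,`, `Don't do that!`, `but`, and `they didn't listen.`.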
The declare built-in seems to do what you want; in my test, it's your inspect function that doesn't seem to work properly for all inputs:
# solution3
declare -a "args=($1)"
Then
$ echo "arg1 'arg2a arg2b' arg3" | while read -r; do
> source solution3 "${REPLY}"
> for arg in "${args[@]}"; do
> echo "Arg $((++i)): $arg"
> done
> done
Arg 1: arg1
Arg 2: arg2a arg2b
Arg 3: arg3
You can do it with declare instead of eval. For example, instead of:
string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
echo "Initial string: $string"
eval 'for word in '$string'; do echo $word; done'
Do:
declare -a "array=($string)"
for item in "${array[@]}"; do echo "[$item]"; done
But please note that this is not much safer if the input comes from a user!
If you try it with a string like, say:
string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" `hostname`'
hostname gets evaluated (and of course it could just as well be something like rm -rf /)!
A very simple attempt to guard against this is to replace characters like the backtick ` and $:
string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" `hostname`'
declare -a "array=( $(echo $string | tr '`$<>' '????') )"
for item in "${array[@]}"; do echo "[$item]"; done
Now you get output like:
[aString that may haveSpaces IN IT]
[bar]
[foo]
[bamboo]
[bam boo]
[?hostname?]
More details about the different methods and their advantages can be found in this good answer: Why should eval be avoided in Bash, and what should I use instead?
See also https://superuser.com/questions/1066455/how-to-split-a-string-with-quotes-like-command-arguments-in-bash/1186997#1186997
But an attack vector still remains. I would very much like bash to have a way of quoting a string that works like double quotes (") but without interpreting the content.
Populate a variable with the combined words once the opening quote is detected, and only append to the array once the closing quote arrives.
#solution2
j=''
for a in ${1}; do
if [ -n "$j" ]; then
[[ $a =~ ^(.*)[\"\']$ ]] && {
args+=("$j ${BASH_REMATCH[1]}")
j=''
} || j+=" $a"
elif [[ $a =~ ^[\"\'](.*)$ ]]; then
j=${BASH_REMATCH[1]}
else
args+=($a)
fi
done
$ ./test_string2array solution2
1. Testing single args
$ echo "arg1 arg2 arg3 arg4 arg5" | ./string2array solution2
args=(
[0]="arg1"
[1]="arg2"
[2]="arg3"
[3]="arg4"
[4]="arg5"
)
Result : PASSED!
2. Testing multi args " quoted
$ echo 'arg1 "multi arg 2" arg3 "a r g 4" arg5' | ./string2array solution2
args=(
[0]="arg1"
[1]="multi arg 2"
[2]="arg3"
[3]="a r g 4"
[4]="arg5"
)
Result : PASSED!
3 Testing multi args ' quoted
$ echo "arg1 'multi arg 2' arg3 'a r g 4' arg5" | ./string2array solution2
args=(
[0]="arg1"
[1]="multi arg 2"
[2]="arg3"
[3]="a r g 4"
[4]="arg5"
)
Result : PASSED!
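Joining whitespace-separated words has limits that are easy to check: a quoted single word such as "bamboo" opens a quote that is never closed, and runs of spaces inside quotes collapse to one. A character-by-character scanner avoids both without evaluating anything. The sketch below is one possible approach, not the method from the answer above; `split_quoted` is a hypothetical name, and its escape handling is a simplification of real shell quoting (inside double quotes only `\"` and `\\` are treated as escapes).

```bash
#!/usr/bin/env bash
# Hypothetical helper: split_quoted STRING fills the global array `args`.
# It never expands $, backticks or globs; it only interprets quotes and backslashes.
split_quoted() {
  local s=$1 c next word='' quote='' inword=0 i
  args=()
  for ((i = 0; i < ${#s}; i++)); do
    c=${s:i:1}
    next=${s:i+1:1}
    case "$quote" in
      "'")  # inside single quotes: everything up to the next ' is literal
        if [[ $c == "'" ]]; then quote=''; else word+=$c; fi
        ;;
      '"')  # inside double quotes: only \" and \\ are treated as escapes
        if [[ $c == '"' ]]; then
          quote=''
        elif [[ $c == '\' && ( $next == '"' || $next == '\' ) ]]; then
          word+=$next
          i=$((i + 1))
        else
          word+=$c
        fi
        ;;
      *)    # outside quotes
        case "$c" in
          "'" | '"') quote=$c; inword=1 ;;   # open a quote (may yield an empty word)
          '\')                               # backslash escapes the next character
            if [[ -n $next ]]; then word+=$next; inword=1; i=$((i + 1)); fi
            ;;
          ' ' | $'\t' | $'\n')               # whitespace ends the current word
            if ((inword)); then args+=("$word"); word=''; inword=0; fi
            ;;
          *) word+=$c; inword=1 ;;
        esac
        ;;
    esac
  done
  [[ -z $quote ]] || return 1                # unterminated quote: report failure
  if ((inword)); then args+=("$word"); fi
  return 0
}

split_quoted "He said, \"Don't do that!\" but \"they didn't listen.\""
printf '[%s]\n' "${args[@]}"
```

For the test 4 string this yields five elements, with `Don't do that!` and `they didn't listen.` kept whole. To try it in the workbench, a solution file could define the function and then call `split_quoted "$1"`.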
So I think xargs actually works for all your test cases, e.g.:
echo 'arg1 "multi arg 2" arg3' | xargs -0 ./string2array
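One caveat worth checking against your own xargs: as far as I can tell from the GNU documentation, `-0` switches the input to NUL-terminated items and makes quotes and backslashes non-special, so the quote parsing this idea relies on happens only without that flag. The sketch below is one way to capture the parsed words in an array rather than pass them to a command. It assumes bash 4.4 or newer (for `mapfile -d`) and an xargs with default quote handling; the variable name `line` is illustrative.

```bash
#!/usr/bin/env bash
# Sketch, assuming bash >= 4.4 (mapfile -d) and xargs with default quote handling.
line='arg1 "multi arg 2" arg3 `hostname`'

# xargs parses the quotes and runs printf with one argument per word;
# printf emits each word NUL-terminated, and mapfile reads them back verbatim.
mapfile -d '' -t args < <(printf '%s\n' "$line" | xargs printf '%s\0')

printf '[%s]\n' "${args[@]}"
```

No shell ever evaluates the input here, so the backticks stay literal text. Note that xargs has its own quoting rules, which are close to but not identical to the shell's.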