bash: Iterating over members of a JSON array selected by index

I'm using jq to parse a JSON file, extracting each JSON array in a series into a shell array.

My current code looks like the following:

for ((i = 0; i < ${#nvars[@]}; i++)); do
    v1=($(cat $INPUT | jq '."config"[i]."var1"[]'))
    echo $v1
done

error message:

error: i is not defined

I also replaced

v1=($(cat $INPUT | jq '."config"[i]."var1"[]'))

with

v1=($(cat $INPUT | jq '."config"[$i]."var1"[]'))

still not working. Any idea? Any help is appreciated!


Edit: Sample Input Data

{
    "config-vars":[
        {
            "var1":["v1","v2"],
            "var2":""
        },
        {
            "var1":["v3",""],
            "var2":"v4"
        }
    ]
}

odieatla asked Jan 16 '15 21:01


4 Answers

There's a fair bit of room for improvement. Let's start here:

v1=($(cat $INPUT | jq '."config"[$i]."var1"[]'))

...first, you don't actually need cat; it only slows things down, because it forces jq to read from a pipe rather than directly from your input file. Just running jq <"$INPUT" would be more robust (or, better, <"$input", to avoid all-uppercase names, which are reserved by convention for shell builtins and environment variables).

Second, you need to quote all variable expansions, including the expansion of the input file's name -- otherwise, you'll get bugs whenever your filename contains spaces.
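As a rough illustration of the quoting problem, here is a small sketch. The helper count_args and the filename are made up for the demonstration; they are not part of the original script.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

input='my config.json'   # made-up filename containing a space

unquoted=$(count_args $input)     # word-split into "my" and "config.json"
quoted=$(count_args "$input")     # passed through as a single argument

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

With the default IFS, the unquoted expansion reaches the command as two separate words, which is why a command like jq would look for two files instead of one.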

Third, array=( $(stuff) ) splits the output of stuff on all characters in IFS, and expands the results of that splitting as a series of glob expressions (so if the output contains *.txt, and you're running this script in a directory that contains text files, you get the names of those files in your result array). Splitting on newlines only would mean you could correctly parse multi-word strings, and disabling glob expansion is necessary before you can use this technique reliably in the presence of glob characters. One way to do this is to set IFS=$'\n' and run set -f before running this command; another is to redirect the output of your command into a while read loop (shown below).
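To make the splitting and globbing visible, here is a minimal sketch. It assumes a scratch directory created with mktemp, and a made-up function emit that stands in for any command printing one value per line (such as jq -r).

```shell
#!/usr/bin/env bash
# Work in a scratch directory that contains two files matching *.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt

# emit is a stand-in for a command that prints one value per line.
emit() { printf '%s\n' 'two words' '*.txt'; }

# Naive capture: split on spaces, tabs and newlines, then glob-expanded.
naive=( $(emit) )

# Line-by-line capture: one element per line, no splitting, no globbing.
safe=()
while IFS= read -r line; do
  safe+=( "$line" )
done < <(emit)

echo "naive: ${#naive[@]} elements (${naive[*]})"
echo "safe:  ${#safe[@]} elements"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The naive array ends up with four elements (two, words, a.txt, b.txt), while the loop keeps the two original lines intact.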

Fourth, string substitution into code is bad practice in any language -- that way lies (local equivalents to) Bobby Tables, allowing someone who's supposed to be able to only change the data passed into your process to provide content which is processed as executable code (albeit, in this case, as a jq script, which is less dangerous than arbitrary code execution in a more full-featured language; still, this can allow extra data to be added to the output).
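A small sketch of what that injection can look like with jq. The JSON document, the "secret" key and the hostile index value are all invented for the example; the point is only to contrast interpolation with --arg.

```shell
#!/usr/bin/env bash
# Made-up document; "secret" stands for data the caller is not meant to see.
json='{"config":[{"var1":["v1","v2"]}],"secret":"hidden"}'

# A hostile "index": it closes the bracket early and appends another query.
i='0].var1[], .secret, .config[0'

# Interpolated into the program text, the value is parsed as jq code.
leaky=$(jq -r ".config[$i].var1[]" <<<"$json")

# Passed with --arg, the value is only ever a string, and tonumber rejects it.
if strict=$(jq -r --arg i "$i" '.config[$i | tonumber].var1[]' <<<"$json" 2>/dev/null); then
  status=accepted
else
  status=rejected
fi

printf 'interpolated output:\n%s\n' "$leaky"
echo "with --arg the request was $status"
```

In the interpolated case jq ends up running .config[0].var1[], .secret, .config[0].var1[], so the extra value appears in the output. With --arg the same input fails the number conversion and nothing is leaked.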

Next, once you're getting jq to emit newline-separated content, you don't need to read it into an array at all: You can iterate over the content as it's written from jq and read into your shell, thus preventing the shell from needing to allocate memory to buffer that content:

while IFS= read -r; do
  echo "read content from jq: $REPLY"
done < <(jq -r --arg i "$i" '.config[$i | tonumber].var1[]' <"$input")

Finally -- let's say you do want to work with an array. There are two ways to do this that avoid pitfalls. One is to set IFS explicitly and disable glob expansion before the assignment:

IFS=$'\n' # split only on newlines
set -f
result=( $(jq -r ... <"$input") )

The other is to assign to your array with a loop:

result=( )
while IFS= read -r; do
  result+=( "$REPLY" )
done < <(jq -r ... <"$input")

...or, as suggested by @JohnKugelman, to use read -a to read the whole array in one operation:

IFS=$'\n' read -r -d '' -a result < <(jq -r ... <"$input")
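On bash 4 or newer, the readarray builtin (also spelled mapfile) is another way to fill the array, offered here as an alternative rather than as part of the answer above. The sketch assumes the sample data from the question, inlined as a string so it runs on its own, and uses jq's --argjson option, which passes the index as a number so no tonumber conversion is needed.

```shell
#!/usr/bin/env bash
# Sample data from the question, inlined so the sketch is self-contained.
json='{"config-vars":[{"var1":["v1","v2"],"var2":""},{"var1":["v3",""],"var2":"v4"}]}'

# Ask jq how many entries exist instead of tracking the count separately.
count=$(jq '."config-vars" | length' <<<"$json")

for ((i = 0; i < count; i++)); do
  # readarray -t stores one line per element and strips the trailing newline.
  readarray -t v1 < <(jq -r --argjson i "$i" '."config-vars"[$i].var1[]' <<<"$json")
  printf 'entry %d has %d values:' "$i" "${#v1[@]}"
  printf ' [%s]' "${v1[@]}"
  printf '\n'
done
```

Unlike approaches that split on IFS=$'\n', readarray keeps empty lines, so the empty string in the second var1 array survives as its own element.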

Charles Duffy answered Nov 08 '22 09:11


Variables aren't interpolated inside single quotes. Use double quotes instead, and drop the inner double quotes around config and var1, which jq doesn't need for plain key names.

v1=($(cat $INPUT | jq ".config[$i].var1[]"))
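To see what jq actually receives in each case, here is a tiny sketch that only prints the program text, without running jq. The variable names single and double are just for the demonstration.

```shell
#!/usr/bin/env bash
i=1

single='.config[$i].var1[]'   # $i stays literal; jq would see an undefined jq variable
double=".config[$i].var1[]"   # the shell substitutes the value of i before jq runs

printf 'single quotes: %s\n' "$single"
printf 'double quotes: %s\n' "$double"
```

The single-quoted form hands jq the text $i unchanged, which is why jq complains that the variable is not defined.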

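Or use the --arg option and then you can stick with single quotes. Note that --arg always passes the value as a string, so it has to be converted with tonumber before it can index an array.

v1=($(cat $INPUT | jq --arg i "$i" '.config[$i | tonumber].var1[]'))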

You could also fix the useless use of cat:

v1=($(jq ".config[$i].var1[]" "$INPUT"))

Also, see @CharlesDuffy's answer for a great, detailed explanation of why assigning to an array like this is unsafe.

John Kugelman answered Nov 08 '22 09:11


If you have already stored a JSON array in a variable called MY_VAR, you can iterate over its elements like this:

while IFS= read -r; do
  echo "$REPLY"
done < <(echo "$MY_VAR" | jq -r '.[]')

Craig answered Nov 08 '22 11:11


jq is capable of extracting the structure in one go, so the entire loop is superfluous. If the input JSON contains more records than you have values in nvars, use head to keep only the first ones. The -c option prints each var1 array on a single line, so head counts records rather than lines of pretty-printed output.

jq -c '."config-vars"[]."var1"' "$INPUT" |
head -n "${#nvars[@]}"  # Keep only the first ${#nvars[@]} records
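The trimming can also be done inside jq with an array slice, which avoids the extra process. This is a sketch under assumptions: the sample data from the question is inlined, and nvars is a made-up one-element array whose length decides how many records to keep.

```shell
#!/usr/bin/env bash
# Sample data from the question, inlined so the sketch is self-contained.
json='{"config-vars":[{"var1":["v1","v2"],"var2":""},{"var1":["v3",""],"var2":"v4"}]}'

nvars=(only_one)   # hypothetical array; only its length matters here

# [:$n] keeps the first $n records; -c prints each var1 array on one line.
kept=$(jq -c --argjson n "${#nvars[@]}" '."config-vars"[:$n][].var1' <<<"$json")

echo "$kept"
```

With a single element in nvars, only the first record's var1 array is printed.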

tripleee answered Nov 08 '22 11:11