 

Parse JSON to array in a shell script

I'm trying to parse a JSON object within a shell script into an array.

e.g., the desired array contents: [Amanda, 25, http://mywebsite.com]

The JSON looks like:

{
  "name"       : "Amanda", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}

I do not want to use any libraries; it would be best if I could use a regular expression or grep. I have tried:

grep name myfile.json

This gives me "name" : "Amanda". I could do this in a loop for each line in the file and add each match to an array, but I only need the right-hand side (the value), not the entire line.

asked Jul 14 '16 by unconditionalcoder



1 Answer

If you really cannot use a proper JSON parser such as jq[1], try an awk-based solution:

Bash 4.x:

readarray -t values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

Bash 3.x:

IFS=$'\n' read -d '' -ra values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)
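To see why $4 is the value, here's a quick, illustrative breakdown of how awk's -F\" field splitting behaves on one sample line (this demo is not part of the original answer):

# With -F\" awk splits on double quotes, so for a line such as
#   "name"       : "Amanda",
# the fields are: $1 = "" (before the first quote), $2 = name,
# $3 = the " : " separator, $4 = Amanda, $5 = "," (NF = 5, so NF>=3 matches).
echo '"name"       : "Amanda",' | awk -F\" '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'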

This stores all property values in Bash array ${values[@]}, which you can inspect with
declare -p values.
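For the sample file above, that should print something like the following (exact formatting can vary slightly between Bash versions):

declare -a values=([0]="Amanda" [1]="25" [2]="http://mywebsite.com")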

These solutions have limitations:

  • each property must be on its own line,
  • all values must be double-quoted,
  • embedded escaped double quotes are not supported.

All these limitations reinforce the recommendation to use a proper JSON parser.


Note: The following alternative solutions use the Bash 4.x+ readarray -t values command, but they also work with the Bash 3.x alternative, IFS=$'\n' read -d '' -ra values.

grep + cut combination: A single grep command won't do (unless you use GNU grep - see below), but adding cut helps:

readarray -t values < <(grep '"' myfile.json | cut -d '"' -f4)

GNU grep: Using -P to support PCREs, which support \K to drop everything matched so far (a more flexible alternative to a look-behind assertion) as well as look-ahead assertions ((?=...)):

readarray -t values < <(grep -Po ':\s*"\K.+(?="\s*,?\s*$)' myfile.json)
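As an illustrative check (not part of the original answer), you can see what the pattern extracts from a single line of the sample file:

echo '  "age"        : "25",' | grep -Po ':\s*"\K.+(?="\s*,?\s*$)'   # -> 25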

Finally, here's a pure Bash (3.x+) solution:

What makes this a viable alternative in terms of performance is that no external utilities are called in each loop iteration; however, for larger input files, a solution based on external utilities will be much faster.

#!/usr/bin/env bash

declare -a values # declare the array

# Read each line and use regex parsing (with Bash's `=~` operator)
# to extract the value.
while read -r line; do
  # Extract the value from between the double quotes
  # and add it to the array.
  [[ $line =~ :[[:blank:]]+\"(.*)\" ]] && values+=( "${BASH_REMATCH[1]}" )
done < myfile.json

declare -p values # print the array

[1] Here's what a robust jq-based solution would look like (Bash 4.x):
readarray -t values < <(jq -r '.[]' myfile.json)
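For completeness, here are two standard jq invocations against the sample file; the .name filter is shown only to illustrate extracting a single property rather than all values:

jq -r '.[]' myfile.json    # all property values, one per line
jq -r '.name' myfile.json  # -> Amanda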

answered Oct 14 '22 by mklement0