Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge json arrays with duplicate keys

I want to merge two json arrays with help of jq. Each object in arrays contains name field, which allow me to group by and merge two arrays into one.

LABELS

[
  {
    "name": "power_branch",
    "description": "master"
  },
  {
    "name": "test_branch",
    "description": "main"
  }
]

RUNNERS

[
  {
    "name": "power_branch",
    "runner": "power",
    "runner_tag": "macos"
  },
  {
    "name": "power_branch",
    "runner": "power",
    "runner_tag": "ubuntu"
  },
  {
    "name": "test_branch",
    "runner": "tester",
    "runner_tag": ""
  },
  {
    "name": "development",
    "runner": "dev",
    "runner_tag": "ubuntu"
  }
]

Desired Output

[
  {
    "name": "power_branch",
    "description": "master",
    "runner": "power",
    "runner_tag": "macos"
  },
  {
    "name": "power_branch",
    "description": "master",
    "runner": "power",
    "runner_tag": "ubuntu"
  },
  {
    "name": "test_branch",
    "description": "main",
    "runner": "tester",
    "runner_tag": ""
  }
]

I tried with following script, but power_branch entry was override, instead i want another entry with different runner_tag

#!/usr/bin/bash

LABELS='[{"name": "power_branch","description": "master"},{"name": "test_branch","description": "main"}]'
RUNNERS='''
[
  { "name": "power_branch", "runner": "power", "runner_tag": "macos" },
  { "name": "power_branch", "runner": "power", "runner_tag": "ubuntu" },
  { "name": "test_branch", "runner": "tester", "runner_tag": "" },
  { "name": "development", "runner": "dev", "runner_tag": "ubuntu" }
]
'''

FINAL=$(jq -s '[ .[0] + .[1] | group_by(.name)[] | select(length > 1) | add]' <(echo $LABELS) <(echo $RUNNERS))
echo $FINAL

OUTPUT

[
  {
    "name": "power_branch",
    "description": "master",
    "runner": "power",
    "runner_tag": "ubuntu"
  },
  {
    "name": "test_branch",
    "description": "main",
    "runner": "tester",
    "runner_tag": ""
  }
]
like image 965
Kuba Ronkiewicz Avatar asked Jan 03 '22 14:01

Kuba Ronkiewicz


People also ask

Can JSON array have duplicate keys?

We can have duplicate keys in a JSON object, and it would still be valid. The validity of duplicate keys in JSON is an exception and not a rule, so this becomes a problem when it comes to actual implementations.

Can a JSON object contain duplicate keys?

I've done some research and I understood that "duplicate" keys in JSON are legal, but different parsers act differently in handling this. How can I avoid the use of duplicate keys or, alternatively, make sure the schema validates the value for all duplicated keys? Hermann.

Can JSON have multiple keys with same name?

There is no "error" if you use more than one key with the same name, but in JSON, the last key with the same name is the one that is going to be used. In your case, the key "name" would be better to contain an array as it's value, instead of having a number of keys "name".

How do I combine two JSON responses?

simple. JSONObject to merge two JSON objects in Java. We can merge two JSON objects using the putAll() method (inherited from interface java.


1 Answers

If you have two files labels.json and runners.json, you could read in the latter (runners) as a variable using --argjson and append to each element of the input array (labels) using map the corresponding fields determined by select.

jq --argjson runners "$(cat runners.json)" '
  map(.name as $name | . + ($runners[] | select(.name == $name)))
' labels.json

However, this reads the whole runners array into your shells command line space (--argjson takes two strings: a name and a value) which can easily overflow if the runners array gets big enough.

Therefore, instead of using command substitution "$(…)", you could read in the runners file directly using either --slurpfile for the cost of another iteration level [][], or (despite the manual saying not to - read more about it in the comments) using --argfile with just a single iteration level as before:

jq --slurpfile runners runners.json '
  map(.name as $name | . + ($runners[][] | select(.name == $name)))
' labels.json
jq --argfile runners runners.json '
  map(.name as $name | . + ($runners[] | select(.name == $name)))
' labels.json

To circumvent all these issues, @peak suggested using input for each file together with the -n option. Note that this requires the two files to be provided in this exact order as they are being read in sequentially.

jq -n 'input as $runners | input |
  map(.name as $name | . + ($runners[] | select(.name == $name)))
' runners.json labels.json

As the second input (labels) is passed on directly as the filter's main input (in contrast to runners, which is stored in a variable for later use), this could be further simplified by removing again the -n option (order of the files still matters):

jq 'input as $runners |
  map(.name as $name | . + ($runners[] | select(.name == $name)))
' runners.json labels.json

Finally, here's yet another approach using the SQL-style operators INDEX and JOIN which were introduced in jq v1.6. This also employs the technique using just one input and also the order of the files still matters as we need the runners array as the filter's primary input.

jq '
  JOIN(INDEX(input[]; .name); .name) | map(select(.[1]) | add)
' runners.json labels.json
like image 168
pmf Avatar answered Oct 11 '22 10:10

pmf