Given arrays X and Y (preferably both as inputs, but otherwise, with one as input and the other hardcoded), how can I use jq to output the array containing all elements common to both? e.g. what is a value of f such that
echo '[1,2,3,4]' | jq 'f([2,4,6,8,10])'
would output
[2,4]
?
I've tried the following:
map(select(in([2,4,6,8,10]))) --> outputs [1,2,3,4]
select(map(in([2,4,6,8,10]))) --> outputs [1,2,3,4,5]
The complexity of all these answers obscured the underlying principle. That's unfortunate, because the principle is simple:
- array1 minus array2 returns:
- everything that's left in array1
- after removing everything that is in array2
- (and discarding the rest of array2)
# From array1, subtract array2, leaving the remainder
$ jq --null-input '[1,2,3,4] - [2,4,6,8]'
[
1,
3
]
# Subtract the remainder from the original
$ jq --null-input '[1,2,3,4] - [1,3]'
[
2,
4
]
# Put it all together
$ jq --null-input '[1,2,3,4] - ([1,2,3,4] - [2,4,6,8])'
[
2,
4
]
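Applied to the original question, the same double subtraction might look like the sketch below. It assumes a jq that supports --argjson (1.5 or later); the jq variable name $y and the shell variable name result are illustrative choices, not part of the answer above.

```shell
# The first array arrives on stdin; the second is passed as JSON via --argjson.
# -c prints the result on a single line.
result=$(echo '[1,2,3,4]' | jq -c --argjson y '[2,4,6,8,10]' '. - (. - $y)')
echo "$result"   # → [2,4]
```

Note that array subtraction removes every occurrence of a matching element but does not deduplicate what remains, so duplicates in the input survive: [1,1,2] - ([1,1,2] - [1]) gives [1,1]. Piping the result through unique would give a set-like answer.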
comm
def comm:
(.[0] - (.[0] - .[1])) as $d |
[.[0]-$d, .[1]-$d, $d]
;
With that understanding, I was able to imitate the behavior of the *nix comm command:
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
$ echo 'def comm: (.[0]-(.[0]-.[1])) as $d | [.[0]-$d,.[1]-$d, $d];' > comm.jq
$ echo '{"a":101, "b":102, "c":103, "d":104}' > 1.json
$ echo '{ "b":202, "d":204, "f":206, "h":208}' > 2.json
$ jq --slurp '.' 1.json 2.json
[
{
"a": 101,
"b": 102,
"c": 103,
"d": 104
},
{
"b": 202,
"d": 204,
"f": 206,
"h": 208
}
]
$ jq --slurp '[.[] | keys | sort]' 1.json 2.json
[
[
"a",
"b",
"c",
"d"
],
[
"b",
"d",
"f",
"h"
]
]
$ jq --slurp 'include "comm"; [.[] | keys | sort] | comm' 1.json 2.json
[
[
"a",
"c"
],
[
"f",
"h"
],
[
"b",
"d"
]
]
$ jq --slurp 'include "comm"; [.[] | keys | sort] | comm[2]' 1.json 2.json
[
"b",
"d"
]
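For a quick check without creating any files, the same def can be inlined and fed two literal arrays. This is only a sketch: the arrays are the ones from the subtraction example, and the shell variable name columns is arbitrary.

```shell
# --null-input: nothing is read from stdin; the pair of arrays is written inline.
columns=$(jq --null-input -c '
  def comm: (.[0] - (.[0] - .[1])) as $d | [.[0] - $d, .[1] - $d, $d];
  [[1,2,3,4], [2,4,6,8]] | comm')
# Order: unique to the first, unique to the second, common to both
echo "$columns"   # → [[1,3],[6,8],[2,4]]
```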
A simple and quite fast (but somewhat naive) filter that probably does essentially what you want can be defined as follows:
# x and y are arrays
def intersection(x;y):
( (x|unique) + (y|unique) | sort) as $sorted
| reduce range(1; $sorted|length) as $i
([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
If x is provided as input on STDIN, and y is provided in some other way (e.g. def y: ...), then you could use this as: intersection(.;y)
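A minimal sketch of that usage, with y hardcoded through a def as suggested. The contents of y are taken from the question, and the shell variable name common is arbitrary.

```shell
common=$(echo '[1,2,3,4]' | jq -c '
  # Sort-and-compare intersection, copied from the answer above
  def intersection(x;y):
    ( (x|unique) + (y|unique) | sort) as $sorted
    | reduce range(1; $sorted|length) as $i
        ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
  # y is hardcoded here; x is whatever arrives on stdin
  def y: [2,4,6,8,10];
  intersection(.;y)')
echo "$common"   # → [2,4]
```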
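Other ways to provide two distinct arrays as input include:
- the --slurp option
- --arg a v, which passes v as a string (so an array must be parsed with $a | fromjson), or --argjson a v, if available in your jq, which passes v as JSON
Here's a simpler but slower def that's nevertheless quite fast in practice: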
def i(x;y):
if (y|length) == 0 then []
else (x|unique) as $x
| $x - ($x - y)
end ;
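Both input styles can be sketched with this def. The example assumes --argjson is available (jq 1.5 or later); the names $y, I_DEF, slurped and viaarg are illustrative.

```shell
# The def from above, kept in a shell variable so both invocations share it
I_DEF='def i(x;y): if (y|length) == 0 then [] else (x|unique) as $x | $x - ($x - y) end;'

# 1. Both arrays on stdin; --slurp wraps them into one array of arrays
slurped=$(echo '[1,2,3,4] [2,4,6,8,10]' | jq -c --slurp "$I_DEF i(.[0]; .[1])")

# 2. One array on stdin, the other passed as JSON with --argjson
viaarg=$(echo '[1,2,3,4]' | jq -c --argjson y '[2,4,6,8,10]' "$I_DEF i(.; \$y)")

echo "$slurped"   # → [2,4]
echo "$viaarg"    # → [2,4]
```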
Here's a standalone filter for finding the intersection of arbitrarily many arrays:
# Input: an array of arrays
def intersection:
def i(y): ((unique + (y|unique)) | sort) as $sorted
| reduce range(1; $sorted|length) as $i
([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
reduce .[1:][] as $a (.[0]; i($a)) ;
Examples:
[ [1,2,4], [2,4,5], [4,5,6]] #=> [4]
[[]] #=> []
[] #=> null
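The first example can be run directly, as sketched here with the def inlined into a single jq invocation. The shell variable name out is arbitrary.

```shell
# --null-input: the array of arrays is written inline rather than read from stdin.
out=$(jq --null-input -c '
  def intersection:
    def i(y): ((unique + (y|unique)) | sort) as $sorted
      | reduce range(1; $sorted|length) as $i
          ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
    reduce .[1:][] as $a (.[0]; i($a)) ;
  [[1,2,4], [2,4,5], [4,5,6]] | intersection')
echo "$out"   # → [4]
```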
Of course, if x and y are already known to be sorted and/or unique, more efficient solutions are possible. See in particular Finite Sets of JSON Entities.