@Jeff Mercado blew my mind! I didn't know array subtraction was allowed...
echo -n '{"all":["A","B","C","ABC"],"some":["B","C"]}' | jq '.all-.some'
yields
[
"A",
"ABC"
]
Answer from Jon on Stack Overflow
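Two properties of jq's array subtraction are worth calling out: values are compared by deep equality (so nested objects match), and every occurrence of a matching value is removed. A quick sketch:

```shell
# Subtraction removes ALL occurrences of each matching value,
# and comparison is by deep equality, so objects match too.
jq -cn '["a","a","b",{"x":1}] - ["a",{"x":1}]'
# → ["b"]
```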
I was looking for a similar solution but with the requirement that the arrays were being generated dynamically. Below solution just does the expected
array1=$(jq -e '')  # jq expression goes here
array2=$(jq -e '')  # jq expression goes here
array_diff=$(jq -n --argjson array1 "$array1" --argjson array2 "$array2" \
  '{"all": $array1, "some": $array2} | .all - .some')
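For concreteness, here is a hypothetical end-to-end version with the placeholder expressions filled in; the input documents and field names are made up for illustration:

```shell
# Hypothetical example: build the two arrays dynamically, then subtract.
array1=$(jq -c '.all' <<< '{"all":["A","B","C","ABC"]}')
array2=$(jq -c '.some' <<< '{"some":["B","C"]}')
array_diff=$(jq -cn --argjson array1 "$array1" --argjson array2 "$array2" \
  '$array1 - $array2')
echo "$array_diff"
# → ["A","ABC"]
```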
- linux - How to merge arrays from multiple arrays in two JSON with JQ - Unix & Linux Stack Exchange
- diff - Using jq or alternative command line tools to compare JSON files - Stack Overflow
- jq - Merge JSON arrays on dissimilar keys - Unix & Linux Stack Exchange
- how to get the intersection of two JSON arrays using jq - Stack Overflow
Assuming the top-most keys of all documents are always the same across all documents, extract the keys into a separate variable, then reduce (accumulate) the data over these keys.
jq -s '
(.[0] | keys[]) as $k |
reduce .[] as $item (null; .[$k] += $item[$k])' file*.json
Note the use of -s to read all the input into a single array.
This, more or less, iterates over the keys Lists1 and Lists2 for each document, accumulating the data in a new structure (null from the start).
Assuming that the input JSON documents are well-formed:
{
"Lists1": [{"point":"a","coordinates":[2289.48096,2093.48096]}],
"Lists2": [{"point":"b","coordinates":[2289.48096,2093.48096]}]
}
{
"Lists1": [{"point":"c","coordinates":[2289.48096,2093.48096]}],
"Lists2": [{"point":"d","coordinates":[2289.48096,2093.48096]}]
}
You will get the following resulting document containing two objects:
{
"Lists1": [{"point":"a","coordinates":[2289.48096,2093.48096]},{"point":"c","coordinates":[2289.48096,2093.48096]}]
}
{
"Lists2": [{"point":"b","coordinates":[2289.48096,2093.48096]},{"point":"d","coordinates":[2289.48096,2093.48096]}]
}
Should you want both keys in the same object, collect the per-key objects into an array and add them:
jq -s '
[ (.[0] | keys[]) as $k |
reduce .[] as $item (null; .[$k] += $item[$k]) ] | add' file*.json
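As a self-contained check of the add variant, here is the same program run against trimmed-down versions of the sample documents, piped in instead of read from files:

```shell
# Two minimal documents on stdin instead of file*.json.
printf '%s\n' \
  '{"Lists1":[{"point":"a"}],"Lists2":[{"point":"b"}]}' \
  '{"Lists1":[{"point":"c"}],"Lists2":[{"point":"d"}]}' |
jq -cs '[ (.[0] | keys[]) as $k |
          reduce .[] as $item (null; .[$k] += $item[$k]) ] | add'
# → {"Lists1":[{"point":"a"},{"point":"c"}],"Lists2":[{"point":"b"},{"point":"d"}]}
```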
If the keys are not always the same across the document this one will do the job:
jq --slurp '
reduce (.[] | to_entries | .[]) as {$key, $value} (
{};
.[$key] += $value
)
' file*.json
Given these two files:
{
"Lists1": [{"point":"a","coordinates":"..."}],
"Lists2": [{"point":"b","coordinates":"..."}]
}
{
"Lists1": [{"point":"c","coordinates":"..."}],
"Lists2": [{"point":"d","coordinates":"..."}],
"Lists3": [{"point":"e","coordinates":"..."}]
}
the output is:
{
"Lists1":[{"point":"a","coordinates":"..."},{"point":"c","coordinates":"..."}],
"Lists2":[{"point":"b","coordinates":"..."},{"point":"d","coordinates":"..."}],
"Lists3":[{"point":"e","coordinates":"..."}]
}
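A self-contained run of the to_entries variant (minimal made-up documents piped in), showing that a key present in only one document still survives:

```shell
printf '%s\n' \
  '{"Lists1":["a"],"Lists2":["b"]}' \
  '{"Lists1":["c"],"Lists3":["e"]}' |
jq -cs 'reduce (.[] | to_entries | .[]) as {$key, $value} ({}; .[$key] += $value)'
# → {"Lists1":["a","c"],"Lists2":["b"],"Lists3":["e"]}
```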
If your shell supports process substitution (Bash-style follows, see docs):
diff <(jq --sort-keys . A.json) <(jq --sort-keys . B.json)
Object key order will be ignored, but array order will still matter. It is possible to work around that, if desired, by sorting array values in some other way, or by making them set-like (e.g. ["foo", "bar"] → {"foo": null, "bar": null}; this will also remove duplicates).
Alternatively, substitute another comparator for diff, e.g. cmp, colordiff, or vimdiff, depending on your needs. If all you want is a yes-or-no answer, consider using cmp and passing --compact-output to jq to skip formatting, for a potential small performance increase.
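The set-like normalization can be sketched as follows, assuming jq 1.6+ for the walk builtin and a bash-style shell (the inline documents stand in for A.json and B.json):

```shell
# Sort every array recursively so comparison ignores array order.
norm='walk(if type == "array" then sort else . end)'
diff <(jq --sort-keys "$norm" <<< '{"k":[2,1]}') \
     <(jq --sort-keys "$norm" <<< '{"k":[1,2]}') \
  && echo "equivalent"
# → equivalent
```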
Use jd with the -set option:
$ jd -set A.json B.json
No output means no difference.
Differences are shown as an @ path and + or -.
$ jd -set A.json C.json
@ ["People",{}]
+ "Carla"
The output diffs can also be used as patch files with the -p option.
$ jd -set -o patch A.json C.json; jd -set -p patch B.json
{"City":"Boston","People":["John","Carla","Bryan"],"State":"MA"}
https://github.com/josephburnett/jd#command-line-usage
NOTE: This solution assumes array1 has no duplicates.
Simple Explanation
The complexity of all these answers obscures the principle at work. That's unfortunate, because the principle is simple:
- array1 minus array2 returns:
- everything that's left in array1
- after removing everything that is in array2
- (and discarding the rest of array2)
Simple Demo
# From array1, subtract array2, leaving the remainder
$ jq --null-input '[1,2,3,4] - [2,4,6,8]'
[
1,
3
]
# Subtract the remainder from the original
$ jq --null-input '[1,2,3,4] - [1,3]'
[
2,
4
]
# Put it all together
$ jq --null-input '[1,2,3,4] - ([1,2,3,4] - [2,4,6,8])'
[
2,
4
]
comm Demo
def comm:
(.[0] - (.[0] - .[1])) as $d |
[.[0]-$d, .[1]-$d, $d]
;
With that understanding, I was able to imitate the behavior of the *nix comm command
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
$ echo 'def comm: (.[0]-(.[0]-.[1])) as $d | [.[0]-$d,.[1]-$d, $d];' > comm.jq
$ echo '{"a":101, "b":102, "c":103, "d":104}' > 1.json
$ echo '{ "b":202, "d":204, "f":206, "h":208}' > 2.json
$ jq --slurp '.' 1.json 2.json
[
{
"a": 101,
"b": 102,
"c": 103,
"d": 104
},
{
"b": 202,
"d": 204,
"f": 206,
"h": 208
}
]
$ jq --slurp '[.[] | keys | sort]' 1.json 2.json
[
[
"a",
"b",
"c",
"d"
],
[
"b",
"d",
"f",
"h"
]
]
$ jq -L . --slurp 'include "comm"; [.[] | keys | sort] | comm' 1.json 2.json
[
[
"a",
"c"
],
[
"f",
"h"
],
[
"b",
"d"
]
]
$ jq -L . --slurp 'include "comm"; [.[] | keys | sort] | comm[2]' 1.json 2.json
[
"b",
"d"
]
A simple and quite fast (but somewhat naive) filter that probably does essentially what you want can be defined as follows:
# x and y are arrays
def intersection(x;y):
( (x|unique) + (y|unique) | sort) as $sorted
| reduce range(1; $sorted|length) as $i
([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
If x is provided as input on STDIN, and y is provided in some other way (e.g. def y: ...), then you could use this as: intersection(.;y)
Other ways to provide two distinct arrays as input include:
- using the --slurp option
- using --arg a v (or --argjson a v if available in your jq)
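A third option is to embed both arrays as literals with --null-input; for instance, inlining the def above (the array values are made up):

```shell
jq -cn '
  def intersection(x; y):
    ((x|unique) + (y|unique) | sort) as $sorted
    | reduce range(1; $sorted|length) as $i
        ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end);
  intersection([1,2,4]; [2,4,5])'
# → [2,4]
```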
Here's a simpler but slower def that's nevertheless quite fast in practice:
def i(x;y):
if (y|length) == 0 then []
else (x|unique) as $x
| $x - ($x - y)
end ;
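The same kind of spot check for the subtraction-based def; note that duplicates in x are removed by the unique call:

```shell
jq -cn '
  def i(x; y):
    if (y|length) == 0 then []
    else (x|unique) as $x
    | $x - ($x - y)
    end;
  i([1,2,2,4]; [2,4,5])'
# → [2,4]
```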
Here's a standalone filter for finding the intersection of arbitrarily many arrays:
# Input: an array of arrays
def intersection:
def i(y): ((unique + (y|unique)) | sort) as $sorted
| reduce range(1; $sorted|length) as $i
([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end) ;
reduce .[1:][] as $a (.[0]; i($a)) ;
Examples:
[ [1,2,4], [2,4,5], [4,5,6]] #=> [4]
[[]] #=> []
[] #=> null
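And a quick runnable check of the n-ary version against the first example above:

```shell
jq -cn '
  def intersection:
    def i(y): ((unique + (y|unique)) | sort) as $sorted
      | reduce range(1; $sorted|length) as $i
          ([]; if $sorted[$i] == $sorted[$i-1] then . + [$sorted[$i]] else . end);
    reduce .[1:][] as $a (.[0]; i($a));
  [[1,2,4], [2,4,5], [4,5,6]] | intersection'
# → [4]
```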
Of course, if x and y are already known to be sorted and/or unique, more efficient solutions are possible. See in particular Finite Sets of JSON Entities.
$ jq -r '[ .[].list1[] ] | join(" ")' file
val1 val2 val3 val4 val5 val6
Create a new array with all the elements of each list1 array from each top-level key. Then, join its elements with spaces. This would give you the values in the order they occur in the input file.
An alternative (and arguably neater) approach is with map(.list1) which returns an array of arrays that you may flatten and join up:
$ jq -r 'map(.list1) | flatten | join(" ")' file
val1 val2 val3 val4 val5 val6
Your attempt generates one joined string per top-level key because .list is bound to each of the list1 arrays in turn. Your approach would work if you wrapped everything up to the last pipe symbol in [ ... ] (and expanded .list to .list[]) to generate a single array that you then join. This is what my first approach above does; it just uses a slightly shorter expression to generate the elements of that array.
$ jq -r '[ to_entries[] | { list: .value.list1 } | .list[] ] | join(" ")' file
val1 val2 val3 val4 val5 val6
Using Raku (formerly known as Perl_6)
~$ raku -MJSON::Tiny -e 'my %hash = from-json($_) given lines;
my @a = %hash.values.map({ $_.values if $_{"list1"} });
.say for @a.sort.join(" ");' file
OR:
~$ raku -MJSON::Tiny -e 'my %hash = from-json($_) given lines;
for %hash.values.sort() { print .values.sort ~ " " if $_{"list1"} };
put "";' file
Raku is a programming language in the Perl-family that provides high-level support for Unicode. Like Perl, Raku has associative arrays (hashes and/or maps) built-in. The above code is admittedly rather verbose (first example), but you should be able to get the flavor of the language from both examples above:
- Raku's community-supported JSON::Tiny module is loaded at the command line,
- All lines are given as one data element to the from-json function, which decodes the input and stores it in %hash,
- First example: using a map, the values of the hash are searched for "list1" keys. If found, the associated values are stored in the @a array, which is then sorted, joined, and printed.
- Second example: the %hash is iterated over with for, searched for "list1" keys, and if found the associated values are printed (with a trailing space). A final put call adds a newline.
Sample Input (includes bogus "list2" elements)
{
"key1": {
"list1": [
"val1",
"val2",
"val3"
]
},
"key2": {
"list1": [
"val4",
"val5"
]
},
"key3": {
"list1": [
"val6"
]
},
"key4": {
"list2": [
"val7"
]
}
}
Sample Output:
val1 val2 val3 val4 val5 val6
Finally, in any programming solution it is often instructive to look at intermediate data-structures. So here's what the %hash looks like after decoding JSON input:
~$ raku -MJSON::Tiny -e 'my %hash = from-json($_) given lines; .say for %hash.sort;' file
key1 => {list1 => [val1 val2 val3]}
key2 => {list1 => [val4 val5]}
key3 => {list1 => [val6]}
key4 => {list2 => [val7]}
https://raku.land/cpan:MORITZ/JSON::Tiny
https://docs.raku.org/language/hashmap
https://raku.org