I have a very large JSON file containing an array. Is it possible to use jq
to split this array into several smaller arrays of a fixed size? Suppose my input was this: [1,2,3,4,5,6,7,8,9,10]
, and I wanted to split it into 3 element long chunks. The desired output from jq
would be:
[1,2,3]
[4,5,6]
[7,8,9]
[10]
In reality, my input array has nearly three million elements, all UUIDs.
Splitting the Array Into Even Chunks Using slice() Method The easiest way to extract a chunk of an array, or rather, to slice it up, is the slice() method: slice(start, end) - Returns a part of the invoked array, between the start and end indices.
The split() method splits a string into an array of substrings. The split() method returns the new array. The split() method does not change the original string. If (" ") is used as separator, the string is split between words.
In the above program, the while loop is used with the splice() method to split an array into smaller chunks of an array. In the splice() method, The first argument specifies the index where you want to split an item. The second argument (here 2) specifies the number of items to split.
TypeScript | String split() Method The split() is an inbuilt function in TypeScript which is used to splits a String object into an array of strings by separating the string into sub-strings.
There is an (undocumented) builtin, _nwise
, that meets the functional requirements:
$ jq -nc '[1,2,3,4,5,6,7,8,9,10] | _nwise(3)'
[1,2,3]
[4,5,6]
[7,8,9]
[10]
Also:
$ jq -nc '_nwise([1,2,3,4,5,6,7,8,9,10];3)'
[1,2,3]
[4,5,6]
[7,8,9]
[10]
Incidentally, _nwise
can be used for both arrays and strings.
(I believe it's undocumented because there was some doubt about an appropriate name.)
Unfortunately, the builtin version is carelessly defined, and will not perform well for large arrays. Here is an optimized version (it should be about as efficient as a non-recursive version):
def nwise($n):
def _nwise:
if length <= $n then . else .[0:$n] , (.[$n:]|_nwise) end;
_nwise;
For an array of size 3 million, this is quite performant: 3.91s on an old Mac, 162746368 max resident size.
Notice that this version (using tail-call optimized recursion) is actually faster than the version of nwise/2
using foreach
shown elsewhere on this page.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With