
How to split an array into chunks with jq?

I have a very large JSON file containing an array. Is it possible to use jq to split this array into several smaller arrays of a fixed size? Suppose my input were [1,2,3,4,5,6,7,8,9,10] and I wanted to split it into chunks of three elements. The desired output from jq would be:

[1,2,3]
[4,5,6]
[7,8,9]
[10]

In reality, my input array has nearly three million elements, all UUIDs.

Echo Nolan asked Jul 19 '18 00:07




1 Answer

There is an (undocumented) builtin, _nwise, that meets the functional requirements:

$ jq -nc '[1,2,3,4,5,6,7,8,9,10] | _nwise(3)'

[1,2,3]
[4,5,6]
[7,8,9]
[10]

Also:

$ jq -nc '_nwise([1,2,3,4,5,6,7,8,9,10];3)' 
[1,2,3]
[4,5,6]
[7,8,9]
[10]

Incidentally, _nwise can be used for both arrays and strings.
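
As a quick check of the string case, here is a minimal sketch. It assumes a jq release that still ships the undocumented _nwise builtin; the sample string "abcdefgh" and the shell variable name are made up for illustration.

```shell
# Split a string into pieces of at most 3 characters using _nwise.
# -n: do not read any input; -c: compact output, one result per line.
chunks=$(jq -nc '"abcdefgh" | _nwise(3)')
printf '%s\n' "$chunks"
```

This should print "abc", "def" and "gh", one JSON string per line, mirroring the array behaviour shown above.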

(I believe it's undocumented because there was some doubt about an appropriate name.)

TCO-version

Unfortunately, the builtin version is carelessly defined, and will not perform well for large arrays. Here is an optimized version (it should be about as efficient as a non-recursive version):

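# Emit successive chunks of at most $n items from the input (array or string).
def nwise($n):
  def _nwise:
    # Emit the first $n items, then recurse on the remainder.
    if length <= $n then . else .[0:$n], (.[$n:] | _nwise) end;
  _nwise;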

For an array of size 3 million, this is quite performant: 3.91s on an old Mac, 162746368 max resident size.

Notice that this version (using tail-call optimized recursion) is actually faster than the version of nwise/2 using foreach shown elsewhere on this page.
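
To apply the definition to a file rather than an inline literal, something like the following should work. It is only a sketch: the temporary file and the ten-element sample stand in for the real file with its array of UUIDs, and the shell variable names are hypothetical.

```shell
# Write a small sample array to a temporary file (a stand-in for the real input).
input=$(mktemp)
printf '[1,2,3,4,5,6,7,8,9,10]\n' > "$input"

# Define nwise inline, then emit one compact chunk per output line.
chunks=$(jq -c '
  def nwise($n):
    def _nwise:
      if length <= $n then . else .[0:$n], (.[$n:] | _nwise) end;
    _nwise;
  nwise(3)
' "$input")
printf '%s\n' "$chunks"

# Clean up the temporary file.
rm -f "$input"
```

With this sample the output should be the four lines from the question: [1,2,3], [4,5,6], [7,8,9] and [10].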

peak answered Sep 30 '22 13:09
