
Simple idiom to break an n-long list into k-long chunks, when n % k > 0?

In Python, it is easy to break an n-long list into k-size chunks if n is a multiple of k (IOW, n % k == 0). Here's my favorite approach (straight from the docs):

>>> k = 3
>>> n = 5 * k
>>> x = range(n)
>>> zip(*[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]

(The trick is that [iter(x)] * k produces a list of k references to the same iterator, as returned by iter(x). zip then builds each chunk by calling next() once on each of those k references, which advances the one shared iterator k times. The * before [iter(x)] * k is necessary because zip expects its iterables as separate arguments, rather than as a single list of them.)
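To make the shared-iterator point concrete, here is a minimal sketch. The names it, refs, all_same, first_chunk and remaining are made up for illustration, and the list() calls are only there so the snippet also runs on Python 3, where range and zip are lazy:

```python
x = list(range(15))
k = 3

it = iter(x)
refs = [it] * k  # k references to ONE iterator object, not k independent iterators

# Every entry in refs is the very same object as `it`.
all_same = all(r is it for r in refs)

# Calling next() once on each reference advances the single shared iterator
# k times; this is what zip does to assemble each chunk.
first_chunk = tuple(next(r) for r in refs)

# zip continues from wherever the shared iterator currently stands.
remaining = list(zip(*refs))

print(all_same)     # True
print(first_chunk)  # (0, 1, 2)
print(remaining)    # [(3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
```

Because first_chunk already consumed three items, the later zip call only sees the remaining twelve.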

The main shortcoming I see with this idiom is that, when n is not a multiple of k (IOW, n % k > 0), the leftover entries are silently dropped; e.g.:

>>> zip(*[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)]

There's an alternative idiom that is slightly longer to type, produces the same result as the one above when n % k == 0, and has a more acceptable behavior when n % k > 0:

>>> map(None, *[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
>>> map(None, *[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, None)]

Here, at least, the leftover entries are retained, but the last chunk gets padded with None. If one just wants a different padding value, itertools.izip_longest with its fillvalue argument solves the problem.
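A minimal sketch of the custom-padding case follows. The fill value -1 and the name padded are arbitrary choices for illustration, and the try/except import is an assumption added so the same snippet runs on Python 3, where the function is named zip_longest:

```python
try:
    from itertools import zip_longest  # Python 3 name
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2 name

x = list(range(15))
k = 4

# Same shared-iterator idiom, but the short final chunk is padded with
# fillvalue instead of being dropped (zip) or padded with None (map(None, ...)).
padded = list(zip_longest(*[iter(x)] * k, fillvalue=-1))

print(padded)
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, -1)]
```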

But suppose the desired solution is one in which the last chunk is left unpadded, i.e.

[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]

Is there a simple way to modify the map(None, *[iter(x)]*k) idiom to produce this result?

(Granted, it is not difficult to solve this problem by writing a function; see, for example, the many fine replies to How do you split a list into evenly sized chunks? or What is the most "pythonic" way to iterate over a list in chunks?. Therefore, a more accurate title for this question would be "How to salvage the map(None, *[iter(x)]*k) idiom?", but I think that would baffle a lot of readers.)

I was struck by how easy it is to break a list into even-sized chunks, and how difficult (in comparison!) it is to get rid of the unwanted padding, even though the two problems seem of comparable complexity.

kjo asked Aug 10 '11 02:08




1 Answer

[x[i:i+k] for i in range(0,n,k)]
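Applied to the question's data with chunks of k + 1 = 4, this could look like the sketch below. It assumes x is a sequence that supports len() and slicing (a bare iterator would not work). The tuple() call is an addition to match the tuple-based output shown in the question, since plain slices of a list are lists:

```python
x = list(range(15))
n = len(x)
k = 4

# Slice from each multiple of k; the final slice simply stops at the end
# of the list, so the last chunk is shorter and no padding is introduced.
chunks = [tuple(x[i:i + k]) for i in range(0, n, k)]

print(chunks)
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]
```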
John La Rooy answered Oct 20 '22 01:10