In Python, it is easy to break an n-long list into k-size chunks if n is a multiple of k (IOW, n % k == 0). Here's my favorite approach (straight from the docs):
>>> k = 3
>>> n = 5 * k
>>> x = range(k * 5)
>>> zip(*[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
(The trick is that [iter(x)] * k produces a list of k references to the same iterator, as returned by iter(x). Then zip generates each chunk by calling next on each of the k references exactly once, so every chunk consumes k consecutive items. The * before [iter(x)] * k is necessary because zip expects to receive its arguments as "separate" iterators, rather than a list of them.)
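To make the shared-iterator point concrete, here is a minimal sketch. It assumes Python 3 (where zip is lazy and has to be wrapped in list(), unlike the Python 2 session above); the names it and refs are made up for illustration.

```python
# Python 3 sketch: zip returns a lazy iterator, so wrap it in list().
x = range(15)
k = 3

it = iter(x)
refs = [it] * k  # k references to ONE iterator, not k independent copies
assert all(r is it for r in refs)

# To build each tuple, zip calls next() once per argument. All arguments
# are the same iterator, so each tuple takes k consecutive items from x.
chunks = list(zip(*refs))
print(chunks)
# [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
```

If the list held k separate iterators instead (for example [iter(x) for _ in range(k)]), zip would yield (0, 0, 0), (1, 1, 1), and so on, which is why the repetition with * k matters.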
The main shortcoming I see with this idiom is that, when n is not a multiple of k (IOW, n % k > 0), the leftover entries are silently dropped; e.g.:
>>> zip(*[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)]
There's an alternative idiom that is slightly longer to type, produces the same result as the one above when n % k == 0, and has a more acceptable behavior when n % k > 0:
>>> map(None, *[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
>>> map(None, *[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, None)]
At least here the leftover entries are retained, but the last chunk gets padded with None. If one just wants a different value for the padding, then itertools.izip_longest solves the problem.
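For reference, a small sketch of the padding variant. It assumes Python 3, where izip_longest is named itertools.zip_longest and map(None, ...) is no longer available; the fill value -1 is an arbitrary choice for illustration.

```python
from itertools import zip_longest  # called izip_longest in Python 2

x = range(15)
k = 3

# Same shared-iterator trick, but the short last chunk is padded with
# the chosen fillvalue instead of None.
padded = list(zip_longest(*[iter(x)] * (k + 1), fillvalue=-1))
print(padded)
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, -1)]
```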
But suppose the desired solution is one in which the last chunk is left unpadded, i.e.
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]
Is there a simple way to modify the map(None, *[iter(x)]*k) idiom to produce this result?
(Granted, it is not difficult to solve this problem by writing a function (see, for example, the many fine replies to How do you split a list into evenly sized chunks? or What is the most "pythonic" way to iterate over a list in chunks?). Therefore, a more accurate title for this question would be "How to salvage the map(None, *[iter(x)]*k) idiom?", but I think it would baffle a lot of readers.)
I was struck by how easy it is to break a list into even-sized chunks, and how difficult (in comparison!) it is to get rid of the unwanted padding, even though the two problems seem of comparable complexity.
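One possible way to keep the idiom and still drop the padding is to pad with a unique sentinel and strip it from the last chunk afterwards. This is only a sketch, assuming Python 3 (zip_longest rather than izip_longest); chunks_unpadded is a hypothetical helper name, not something from the standard library.

```python
from itertools import zip_longest  # called izip_longest in Python 2


def chunks_unpadded(iterable, k):
    """Yield k-tuples from iterable; the last one may be shorter."""
    sentinel = object()  # unique marker that cannot collide with real data
    for chunk in zip_longest(*[iter(iterable)] * k, fillvalue=sentinel):
        # Only the final chunk can contain padding, and if it does,
        # its last slot is always padded.
        if chunk[-1] is sentinel:
            chunk = tuple(v for v in chunk if v is not sentinel)
        yield chunk


print(list(chunks_unpadded(range(15), 4)))
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]
```

Using a fresh object() as the fill value, rather than None, keeps the approach correct even when None is a legitimate element of the input.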
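The easiest way to split a list into equal-sized chunks is to slice it repeatedly, shifting the start and end positions by a fixed step k. The last chunk is simply shorter when n is not a multiple of k. (NumPy's array_split is a different tool: it splits an array into a given number of nearly equal chunks, rather than into chunks of a given size.) With n = len(x):
[x[i:i+k] for i in range(0,n,k)]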
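Wrapped in a function, the slicing approach might look like the sketch below. The name slice_chunks is made up for illustration, and the input is assumed to be a sequence that supports len() and slicing (a list, tuple, or string), not an arbitrary iterator.

```python
def slice_chunks(seq, k):
    """Split seq into consecutive slices of length k (last may be shorter)."""
    return [seq[i:i + k] for i in range(0, len(seq), k)]


x = list(range(15))
print(slice_chunks(x, 4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14]]
```

Unlike the zip-based idiom, this yields slices of the original type (lists here, not tuples) and needs no padding step, at the cost of only working on sequences.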