Python generator that groups another iterable into groups of N [duplicate]

I'm looking for a function that takes an iterable i and a size n and yields tuples of length n that are sequential values from i:

x = [1,2,3,4,5,6,7,8,9,0]
[z for z in TheFunc(x,3)]

gives

[(1,2,3),(4,5,6),(7,8,9),(0,)]

Does such a function exist in the standard library?

If it exists as part of the standard library, I can't seem to find it and I've run out of terms to search for. I could write my own, but I'd rather not.

BCS asked Oct 21 '10 23:10



3 Answers

When you want to group an iterator in chunks of n without padding the final group with a fill value, use iter(lambda: list(IT.islice(iterable, n)), []):

import itertools as IT

def grouper(n, iterable):
    """
    >>> list(grouper(3, 'ABCDEFG'))
    [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
    """
    iterable = iter(iterable)
    return iter(lambda: list(IT.islice(iterable, n)), [])

seq = [1,2,3,4,5,6,7]
print(list(grouper(3, seq)))

yields

[[1, 2, 3], [4, 5, 6], [7]] 

There is an explanation of how it works in the second half of this answer.
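As a rough sketch of the mechanism (not the linked explanation itself): the two-argument form iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns a value equal to the sentinel. In grouper, the callable takes the next n items from one shared iterator, and the empty list it returns once that iterator is exhausted acts as the sentinel. The helper name make_taker below is made up for illustration.

```python
import itertools as IT

def make_taker(iterable, n):
    """Return a zero-argument callable that takes the next n items (hypothetical helper)."""
    it = iter(iterable)  # one shared iterator; every call consumes from it
    return lambda: list(IT.islice(it, n))

# Calling the taker by hand shows what iter() sees on each step.
take = make_taker([1, 2, 3, 4, 5, 6, 7], 3)
manual = [take(), take(), take(), take()]
print(manual)  # [[1, 2, 3], [4, 5, 6], [7], []]

# iter(callable, sentinel) automates the loop and stops at the first [].
grouped = list(iter(make_taker([1, 2, 3, 4, 5, 6, 7], 3), []))
print(grouped)  # [[1, 2, 3], [4, 5, 6], [7]]
```

The final empty list is never yielded, which is why no padding or empty group appears in the result.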


When you want to group an iterator in chunks of n and pad the final group with a fill value, use the grouper recipe zip_longest(*[iterator]*n):

For example, in Python2:

>>> list(IT.izip_longest(*[iter(seq)]*3, fillvalue='x'))
[(1, 2, 3), (4, 5, 6), (7, 'x', 'x')]

In Python3, what was izip_longest is now renamed zip_longest:

>>> list(IT.zip_longest(*[iter(seq)]*3, fillvalue='x'))
[(1, 2, 3), (4, 5, 6), (7, 'x', 'x')]
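The trick in the grouper recipe is that [iter(seq)]*n holds n references to the same iterator, not n separate ones. A small sketch for Python 3 that makes this visible (the variable names are illustrative only):

```python
from itertools import zip_longest

seq = [1, 2, 3, 4, 5, 6, 7]
it = iter(seq)
args = [it] * 3  # three references to the SAME iterator object
same = all(a is it for a in args)
print(same)  # True

# zip_longest takes one item from each argument in turn; since every
# argument is the same iterator, each tuple holds 3 consecutive items.
padded = list(zip_longest(*args, fillvalue='x'))
print(padded)  # [(1, 2, 3), (4, 5, 6), (7, 'x', 'x')]

# With three independent iterators, items are repeated instead of grouped.
independent = list(zip_longest(iter(seq), iter(seq), iter(seq)))
print(independent[:2])  # [(1, 1, 1), (2, 2, 2)]
```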

When you want to group a sequence in chunks of n you can use the chunks recipe:

def chunks(seq, n):
    # https://stackoverflow.com/a/312464/190597 (Ned Batchelder)
    """Yield successive n-sized chunks from seq."""
    for i in range(0, len(seq), n):  # use xrange in Python 2
        yield seq[i:i + n]

Note that, unlike iterators in general, sequences by definition have a length (i.e. __len__ is defined).
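To illustrate the difference, here is a minimal Python 3 sketch of the chunks recipe applied first to sequences and then to a generator. The generator has no len(), so the recipe is expected to fail with a TypeError; the variable names are illustrative only.

```python
def chunks(seq, n):
    """Yield successive n-sized chunks from a sequence (needs len() and slicing)."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

list_chunks = list(chunks([1, 2, 3, 4, 5, 6, 7], 3))
print(list_chunks)  # [[1, 2, 3], [4, 5, 6], [7]]

# Slicing preserves the sequence type, so a str yields str chunks.
str_chunks = list(chunks('ABCDEFG', 3))
print(str_chunks)  # ['ABC', 'DEF', 'G']

gen = (x * x for x in range(7))  # a generator: iterable, but not a sequence
try:
    list(chunks(gen, 3))
    failed = False
except TypeError:
    failed = True  # len(gen) is not defined
print(failed)  # True
```

For arbitrary iterators, one of the islice-based approaches is the safer choice.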

unutbu answered Oct 11 '22 09:10


See the grouper recipe in the docs for the itertools package

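from itertools import zip_longest  # izip_longest in Python 2

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    # n references to the same iterator, so each tuple gets n consecutive items
    args = [iter(iterable)] * n
    return zip_longest(fillvalue=fillvalue, *args)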

(However, this is a duplicate of quite a few questions.)

Andrew Jaffe answered Oct 11 '22 09:10


How about this one? It doesn't have a fill value though.

>>> import itertools
>>> def partition(itr, n):
...     i = iter(itr)
...     while True:
...         res = list(itertools.islice(i, 0, n))
...         if res == []:
...             break
...         yield res
...
>>> list(partition([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

It wraps the original iterable in an iterator (no copy is made), and each islice call advances that iterator by up to n items until it is exhausted. The only other way my tired brain could come up with was generating slice end-points with range.

Maybe I should change list() to tuple() so it better corresponds to your output.
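A minimal sketch of that tuple variant, assuming Python 3. The name partition_tuples is made up here to keep it apart from the version above:

```python
import itertools

def partition_tuples(itr, n):
    """Yield tuples of up to n consecutive items from itr, without padding."""
    i = iter(itr)
    while True:
        res = tuple(itertools.islice(i, n))
        if not res:  # empty tuple: the iterator is exhausted
            return
        yield res

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
result = list(partition_tuples(x, 3))
print(result)  # [(1, 2, 3), (4, 5, 6), (7, 8, 9), (0,)]
```

On Python 3.12 and later, itertools.batched(x, 3) in the standard library should give the same tuples.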

Skurmedel answered Oct 11 '22 08:10