
split a generator/iterable every n items in python (splitEvery)

I'm trying to write the Haskell function 'splitEvery' in Python. Here is its definition:

    splitEvery :: Int -> [e] -> [[e]]
        @'splitEvery' n@ splits a list into length-n pieces.  The last
        piece will be shorter if @n@ does not evenly divide the length of
        the list.

The basic version of this works fine, but I want a version that works with generator expressions, lists, and iterators. And, if there is a generator as an input it should return a generator as an output!

Tests

    # should not enter infinite loop with generators or lists
    splitEvery(10, itertools.count())
    splitEvery(10, range(1000))

    # last piece must be shorter if n does not evenly divide
    assert splitEvery(5, range(9)) == [[0, 1, 2, 3, 4], [5, 6, 7, 8]]

    # should give same correct results with generators
    tmp = itertools.islice(itertools.count(), 9)
    assert list(splitEvery(5, tmp)) == [[0, 1, 2, 3, 4], [5, 6, 7, 8]]

Current Implementation

Here is the code I currently have but it doesn't work with a simple list.

    def splitEvery_1(n, iterable):
        res = list(itertools.islice(iterable, n))
        while len(res) != 0:
            yield res
            res = list(itertools.islice(iterable, n))
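The likely reason splitEvery_1 never finishes on a list is that each islice(iterable, n) call asks the list for a fresh iterator, so it restarts at the first element every time, while a generator keeps its position between calls. A minimal Python 3 sketch of that difference (the variable names are illustrative, not from the question):

```python
from itertools import islice

data = [0, 1, 2, 3, 4, 5, 6, 7, 8]

# Each islice call on a list begins a new iteration from the first element.
first = list(islice(data, 5))
second = list(islice(data, 5))
print(first, second)

# Wrapping the list in iter() yields one iterator that remembers its position.
it = iter(data)
first_it = list(islice(it, 5))
second_it = list(islice(it, 5))
print(first_it, second_it)
```

Calling iter() on a generator returns the generator itself, so the same wrapping is harmless for inputs that are already iterators.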

This one doesn't work with a generator expression (thanks to jellybean for fixing it):

    def splitEvery_2(n, iterable):
        return [iterable[i:i+n] for i in range(0, len(iterable), n)]
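splitEvery_2 relies on len() and slicing, and generator objects support neither. A small Python 3 sketch, with illustrative names, that records the error raised by each operation:

```python
gen = (x * x for x in range(9))

errors = []
# Try the two operations splitEvery_2 depends on: len() and slicing.
for operation in (lambda g: len(g), lambda g: g[0:5]):
    try:
        operation(gen)
    except TypeError as exc:
        errors.append(type(exc).__name__)

print(errors)
```

Both attempts fail before any element is consumed, which is why the slicing approach cannot be adapted to generators without first materialising them.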

There has to be a simple piece of code that does the splitting. I know I could just have different functions, but it seems like it should be an easy thing to do. I'm probably getting stuck on an unimportant problem, but it's really bugging me.


It is similar to grouper from http://docs.python.org/library/itertools.html#itertools.groupby but I don't want it to fill extra values.

    def grouper(n, iterable, fillvalue=None):
        "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
        args = [iter(iterable)] * n
        return izip_longest(fillvalue=fillvalue, *args)

It does mention a method that truncates the last value. This isn't what I want either.

The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using izip(*[iter(s)]*n).

    list(izip(*[iter(range(9))]*5)) == [(0, 1, 2, 3, 4)]
    # should be [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
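For comparison, here is a sketch of both itertools idioms in Python 3, where izip and izip_longest are spelled zip and zip_longest. It assumes the same range(9) input and group size 5 used above. Neither result matches the desired output: one drops the short tail and the other pads it.

```python
from itertools import zip_longest

# The same iterator repeated 5 times, so each output tuple consumes 5 items.
truncated = list(zip(*[iter(range(9))] * 5))

# zip_longest keeps the tail but pads it with the fill value.
filled = list(zip_longest(*[iter(range(9))] * 5, fillvalue=None))

print(truncated)
print(filled)
```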
James Brooks asked Dec 16 '09 14:12




1 Answer

    from itertools import islice

    def split_every(n, iterable):
        i = iter(iterable)
        piece = list(islice(i, n))
        while piece:
            yield piece
            piece = list(islice(i, n))

Some tests:

    >>> list(split_every(5, range(9)))
    [[0, 1, 2, 3, 4], [5, 6, 7, 8]]

    >>> list(split_every(3, (x**2 for x in range(20))))
    [[0, 1, 4], [9, 16, 25], [36, 49, 64], [81, 100, 121], [144, 169, 196], [225, 256, 289], [324, 361]]

    >>> [''.join(s) for s in split_every(6, 'Hello world')]
    ['Hello ', 'world']

    >>> list(split_every(100, []))
    []
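The same idea can also be written with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. This is a sketch of an alternative, not part of the original answer. The name split_every_sentinel is hypothetical, and the code assumes Python 3.

```python
from itertools import count, islice

def split_every_sentinel(n, iterable):
    """Hypothetical variant of split_every built on iter(callable, sentinel)."""
    it = iter(iterable)
    # The lambda is called repeatedly; iteration stops when it returns [].
    return iter(lambda: list(islice(it, n)), [])

pieces = list(split_every_sentinel(5, range(9)))
print(pieces)

# Lazy, so an infinite input is fine as long as only some pieces are taken.
first_piece = next(split_every_sentinel(10, count()))
print(first_piece)
```

On Python 3.12 and later, itertools.batched(iterable, n) provides similar behaviour in the standard library, although it yields tuples rather than lists.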
Roberto Bonvallet answered Oct 13 '22 21:10