I've got a large iterable, in fact, a large iterable given by:
itertools.permutations(range(10))
I would like to access to the millionth element. I alredy have problem solved in some different ways.
Casting iterable to list and getting 1000000th element:
return list(permutations(range(10)))[999999]
Manually skiping elements till 999999:
p = permutations(range(10))
for i in xrange(999999): p.next()
return p.next()
Manually skiping elements v2:
p = permutations(range(10))
for i, element in enumerate(p):
if i == 999999:
return element
Using islice from itertools:
return islice(permutations(range(10)), 999999, 1000000).next()
But I still don't feel like none of them is the python's elegant way to do that. First option is just too expensive, it needs to compute the whole iterable just to access a single element. If I'm not wrong, islice does internally the same computation I just did in method 2, and is almost exactly as 3rd, maybe it has even more redundant operations.
So, I'm just curious, wondering if there is in python some other way to access to a concrete element of an iterable, or at least to skip the first elements, in some more elegant way, or if I just need to use one of the aboves.
If you want to force your iterable to skip forwards, you must call . next() .
The break statement can be used if you need to break out of a for or while loop and move onto the next section of code. The continue statement can be used if you need to skip the current iteration of a for or while loop and move onto the next iteration.
Using next() this way can raise a StopIteration exception, if the iterable is out of values. The islice(song_iter, 3, 4) iterable will skip 3 elements, then return the 4th, then be done. Calling next() on that object thus retrieves the 4th element from song_iter() .
Use the itertools
recipe consume
to skip n
elements:
def consume(iterator, n):
"Advance the iterator n-steps ahead. If n is none, consume entirely."
# Use functions that consume iterators at C speed.
if n is None:
# feed the entire iterator into a zero-length deque
collections.deque(iterator, maxlen=0)
else:
# advance to the empty slice starting at position n
next(islice(iterator, n, n), None)
Note the islice()
call there; it uses n, n
, effectively not returning anything, and the next()
function falls back to the default.
Simplified to your example, where you want to skip 999999 elements, then return element 1000000:
return next(islice(permutations(range(10)), 999999, 1000000))
islice()
processes the iterator in C, something that Python loops cannot beat.
To illustrate, here are the timings for just 10 repeats of each method:
>>> from itertools import islice, permutations
>>> from timeit import timeit
>>> def list_index():
... return list(permutations(range(10)))[999999]
...
>>> def for_loop():
... p = permutations(range(10))
... for i in xrange(999999): p.next()
... return p.next()
...
>>> def enumerate_loop():
... p = permutations(range(10))
... for i, element in enumerate(p):
... if i == 999999:
... return element
...
>>> def islice_next():
... return next(islice(permutations(range(10)), 999999, 1000000))
...
>>> timeit('f()', 'from __main__ import list_index as f', number=10)
5.550895929336548
>>> timeit('f()', 'from __main__ import for_loop as f', number=10)
1.6166789531707764
>>> timeit('f()', 'from __main__ import enumerate_loop as f', number=10)
1.2498459815979004
>>> timeit('f()', 'from __main__ import islice_next as f', number=10)
0.18969106674194336
The islice()
method is nearly 7 times faster than the next fastest method.
Finding the nth permutation may just be an example but if this is actually the problem you are trying to solve then there is a much better way to do this. Instead of skipping the elements of the iterable you can calculate the nth permutation directly. Borrowing the code from another answer here:
import math
def nthperm(li, n):
li = list(li)
n -= 1
s = len(li)
res = []
if math.factorial(s) <= n:
return None
for x in range(s-1,-1,-1):
f = math.factorial(x)
d = n / f
n -= d * f
res.append(li[d])
del(li[d])
return res
Example and timing comparison:
In [4]: nthperm(range(10), 1000000)
Out[4]: [2, 7, 8, 3, 9, 1, 5, 4, 6, 0]
In [5]: next(islice(permutations(range(10)), 999999, 1000000))
Out[5]: (2, 7, 8, 3, 9, 1, 5, 4, 6, 0)
In [6]: %timeit nthperm(range(10), 1000000)
100000 loops, best of 3: 9.01 us per loop
In [7]: %timeit next(islice(permutations(range(10)), 999999, 1000000))
10 loops, best of 3: 29.5 ms per loop
Same answer, over 3000 times faster. Note that I did make a slight modification to the original code so that it will no longer destroy the original list.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With