Earlier, I was trying to answer a question where I wanted to iterate over a list slice as efficiently as possible.
for x in lst[idx1:]:
isn't ideal as it creates a copy (In general, this is O(n)
). My next thought was to use itertools.islice
. But if you look at the documentation, it appears that islice
will call next
until it finds the index it is looking for at which point it will start to yield values. This is also O(n)
. It seems that there is an optimization that is available here if the object passed to islice
is a list
or a tuple
-- It seems that you could iterate over the "slice" directly (in C) without actually making a copy. I was curious if this optimization is in the source, But I didn't find anything. I'm not extremely familiar with C and the python source tree, so it's entirely possible that I missed it.
My question is this:
Is there a way to iterate over a list "slice" without making a copy of the list slice and without burning through a bunch of unwanted elements (in an optimized C implementation)?
I'm well aware that I could write my own generator for this (very naively, not accounting for the fact that many of the arguments should be optional, etc.):
def myslice(obj,start,stop,stride):
for i in xrange(start,stop,stride):
yield obj[i]
but that's definitely not going to beat an optimized C implementation.
If you're wondering why I would need this over just looping over a slice directly, consider the difference between:
takewhile(lambda x: x == 5, lst[idx:]) #copy's the tail of the list unnecessarily
and
takewhile(lambda x: x == 5, islice(lst,idx,None)) #inspects the head of the list unnecessarily
and finally:
takewhile(lambda x: x == 5, magic_slice(lst,idx,None)) #How to create magic_slice???
I think it's worth mentioning that NumPy slices are non-copying (they create a view onto the underlying array). Therefore, if you can use NumPy arrays for your data, that would solve the problem. On top of that you could get additional performance improvements through vectorization.
Is there a way to iterate over a list "slice" without making a copy of the list slice and without burning through a bunch of unwanted elements (in an optimized C implementation)?
Yes there is, if you write that C implementation. Cython makes this particularly easy.
cdef class ListSlice(object):
cdef object seq
cdef Py_ssize_t start, end
def __init__(self, seq, Py_ssize_t start, Py_ssize_t end):
self.seq = seq
self.start = start
self.end = end
def __iter__(self):
return self
def __next__(self):
if self.start == self.end:
raise StopIteration()
r = self.seq[self.start]
self.start += 1
return r
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With