I'm looking for an efficient way of achieving this, which I think is a slicing-like operation:
>>> mylist = range(100) >>>magicslicer(mylist, 10, 20) [0,1,2,3,4,5,6,7,8,9,30,31,32,33,34,35,36,37,38,39,60,61,62,63......,97,98,99]
the idea is: the slicing gets 10 elements, then skips 20 elements, then gets next 10, then skips next 20, and so on.
I think I should not use loops if possible, for the very reason to use slice is (I guess) to do the "extraction" efficiently in a single operation.
Thanks for reading.
In short, slicing is a flexible tool to build new lists out of an existing list. Python supports slice notation for any sequential data type like lists, strings, tuples, bytes, bytearrays, and ranges. Also, any new data structure can add its support as well.
The format for list slicing is [start:stop:step]. start is the index of the list where slicing starts. stop is the index of the list where slicing ends. step allows you to select nth item within the range start to stop.
As well as using slicing to extract part of a list (i.e. a slice on the right hand sign of an equal sign), you can set the value of elements in a list by using a slice on the left hand side of an equal sign. In python terminology, this is because lists are mutable objects, while strings are immutable.
Second note, when no start is defined as in A[:2] , it defaults to 0. There are two ends to the list: the beginning where index=0 (the first element) and the end where index=highest value (the last element).
itertools.compress
(new in 2.7/3.1) nicely supports use cases like this one, especially when combined with itertools.cycle
:
from itertools import cycle, compress seq = range(100) criteria = cycle([True]*10 + [False]*20) # Use whatever pattern you like >>> list(compress(seq, criteria)) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
Python 2.7 timing (relative to Sven's explicit list comprehension):
$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]" 100000 loops, best of 3: 4.96 usec per loop $ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))" 100000 loops, best of 3: 4.76 usec per loop
Python 3.2 timing (also relative to Sven's explicit list comprehension):
$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]" 100000 loops, best of 3: 7.41 usec per loop $ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))" 100000 loops, best of 3: 4.78 usec per loop
As can be seen, it doesn't make a great deal of difference relative to the in-line list comprehension in 2.7, but helps significantly in 3.2 by avoiding the overhead of the implicit nested scope.
A similar difference can also be seen in 2.7 if the aim is to iterate over the resulting sequence rather than turn it into a fully realised list:
$ ./python -m timeit -s "a = range(100)" "for x in (x for start in range(0, len(a), 30) for x in a[start:start+10]): pass" 100000 loops, best of 3: 6.82 usec per loop $ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "for x in compress(a, criteria): pass" 100000 loops, best of 3: 3.61 usec per loop
For especially long patterns, it is possible to replace the list in the pattern expression with an expression like chain(repeat(True, 10), repeat(False, 20))
so that it never has to be fully created in memory.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With