What is the fastest way to reposition a sublist from a list in Python?
Let's say we have a list L = [a,b,c,d,e,f,g,h]
, now I want to take [c,d,e]
and put it after g
in the list. How can I do this fast?
Edit: In other words I would like to write a function that:
The main question I guess is how to insert a list into list as fast as possible.
I assume OP wants to do this inplace.
The key to making the operation fast is to minimize the creation of lists and the shortening/lengthening of lists. This means we must strive to always do a 1:1 assignment of list indices, so no L[i:i] = L[a:b]
and no L[a:b] = []
. Using loops with insert
and pop
is even worse, because then you shorten and lengthen the list many times. Concatenating lists is also bad because you first have to create one list for each part and then create larger and larger concatenated lists, once for each +
. Since you want to do this "inplace", you'd have to assign the generated list to L[:]
in the end.
# items: 0 | 1 2 3 | 4 5 6 7 | 8 9
# a span1 b span2 c
# pos: 1 4 8
# Result:
# 0 | 4 5 6 7 | 1 2 3 | 8 9
# a span2 span2 c
Lets first make an observation. If a = start
, b = end = start + length
, and c
is the insert position, then the operation we wish to do is to cut at the |
markers and swap span1
and span2
. But if b = start
and c = end
and a
is the insert position, then we also want to swap span1
and span2
. So in our function, we just deal with two consecutive segments that must be swapped.
We can't wholly avoid making new lists, because we need to store overlapping values while moving stuff around. We can however make the list as short as possible, by choosing which of the two spans to store to a temporary list.
def inplace_shift(L, start, length, pos):
if pos > start + length:
(a, b, c) = (start, start + length, pos)
elif pos < start:
(a, b, c) = (pos, start, start + length)
else:
raise ValueError("Cannot shift a subsequence to inside itself")
if not (0 <= a < b < c <= len(L)):
msg = "Index check 0 <= {0} < {1} < {2} <= {3} failed."
raise ValueError(msg.format(a, b, c, len(L)))
span1, span2 = (b - a, c - b)
if span1 < span2:
tmp = L[a:b]
L[a:a + span2] = L[b:c]
L[c - span1:c] = tmp
else:
tmp = L[b:c]
L[a + span2:c] = L[a:b]
L[a:a + span2] = tmp
Kos seems to have made an error in his timings, so I redid them with his code after correcting the arguments (calculating end
from start
and length
), and these are the results, from slowest to fastest.
Nick Craig-Wood: 100 loops, best of 3: 8.58 msec per loop
vivek: 100 loops, best of 3: 4.36 msec per loop
PaulP.R.O. (deleted?): 1000 loops, best of 3: 838 usec per loop
unbeli: 1000 loops, best of 3: 264 usec per loop
lazyr: 10000 loops, best of 3: 44.6 usec per loop
I have not tested that any of the other approaches yield correct results.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With