I am trying to solve a problem that involves reversing list splices, and I am having trouble with the time limit for a test case,, which is 4 seconds. The question:
Farmer John's N cows (1≤N≤100) are standing in a line. The ith cow from the left has label i, for each 1≤i≤N. Farmer John has come up with a new morning exercise routine for the cows. He tells them to repeat the following two-step process exactly K (1≤K≤1000000000) times:
The sequence of cows currently in positions A1…A2 from the left reverse their order (1≤A1<A2≤N). Then, the sequence of cows currently in positions B1…B2 from the left reverse their order (1≤B1<B2≤N). After the cows have repeated this process exactly K times, please output the label of the ith cow from the left for each 1≤i≤N.
SCORING: Test cases 2-3 satisfy K≤100. Test cases 4-13 satisfy no additional constraints.
INPUT FORMAT (file swap.in): The first line of input contains N and K. The second line contains A1 and A2, and the third contains B1 and B2.
OUTPUT FORMAT (file swap.out): On the ith line of output, print the label of the ith cow from the left at the end of the exercise routine.
SAMPLE INPUT:
7 2
2 5
3 7
SAMPLE OUTPUT:
1
2
4
3
5
7
6
Initially, the order of the cows is [1,2,3,4,5,6,7] from left to right. After the first step of the process, the order is [1,5,4,3,2,6,7]. After the second step of the process, the order is [1,5,7,6,2,3,4]. Repeating both steps a second time yields the output of the sample.
Theoretically, you could solve this problem by finding the point where the program repeats, and then simulating the reverse k % frequency
times, where frequency
is the amount of times the simulation is unique.
But my problem is that when the input is:
100 1000000000
1 94
2 98
my program takes over 100 seconds to run. This input is particularly time consuming because it runs the maximum number of iterations, and frequency
is very high.
Current Code:
fin = open("swap.in", 'r')
line = fin.readline().strip().split()
n = int(line[0])
k = int(line[1])
nums = [[int(x)-1 for x in fin.readline().strip().split()]for i in range(2)]
fin.close()
repeated = []
cows = [i for i in range(1, n+1)]
repeat = False
while not repeat:
for i in nums:
cows[i[0]:i[1]+1] = reversed(cows[i[0]:i[1]+1])
if cows[i[0]:i[1]+1] in repeated :
frequency = len(repeated)-1
repeat = True
repeated.append(cows[i[0]:i[1]+1])
cows = [i for i in range(1, n+1)]
for _ in range(k%frequency):
for i in nums:
cows[i[0]:i[1]+1] = reversed(cows[i[0]:i[1]+1])
fout = open("swap.out", 'w')
for i in cows:
fout.write(str(i) + "\n")
fout.close()
If anyone knows a way to solve this issue, please post an answer. Comment if anything isn't clear.
Lets first talk about how we could solve this mathematically, and then work out a solution programmatically.
Lets say the cow at each position is represented by the variable Pi.j
, where i
is the cow index, and j
is the the swap iteration. These variables will each contain an integer corresponding to that cow's unique id. Also, for simplicity, we'll only consider a single reversal operation per iteration, and expand to include both later on.
Starting with P0.0 (0th position, 0th iteration), we want to start defining some equations to give us the cow at a given position on the next iteration. If the cow is outside of the reversal region, this is trivial; the cow does not change. If it is within the reversal region, we'll want to calculate the previous position of the new cow based on the endpoints of the reversal region. Explicitly:
Pi.(j+1) = Pi.j
Pi.(j+1) = P(e1+e2-i).j
, where e1
and e2
are the region endpointsNow, with our rules, we can actually write out a system of equations:
P0.b = 1 * P0.a
P1.b = 1 * P1.a
...
P4.b = 1 * P7.a // reversal region starts
P5.b = 1 * P6.a
...
From here, we can transform these system of questions to a transformation matrix MA
, which when multiplied by a vector of cow ids, will return a new vector of cow ids post-reversal.
This is where things get fancy. We can do the same thing for the other reversal region, making a second matrix MB
, and multiply it with MA
to get an overall matrix M
which does both reversals in one. From here, we can raise the matrix to the power K
(the number of iterations), for a single matrix which will calculate the cows at each position after all reversals take place.
Now, at this point, you're probably questioning the performance of this approach- afterall, we're raising a 100x100 matrix to some power K
up to 109
! Well, we've got a few tricks up our sleeves to make this much faster. First, note that a single cow, any cow, we can logically reason through shuffling across all the reversal operations and determine its end position, all without knowing anything about where the other cows are / which other cows are which.
This means for any given position at any given time, the cow at that position can be defined by exactly one other position on any other iteration (e.g. P12.7 (position 12 iteration 7) can determined exactly by knowing the cow at the corresponding position in iteration, say, iteration 5- P8.5).
This is useful because it means each row of our matrix will have exactly one non-zero element, and that element will have a value of 1
(coefficient of 1
in the system of equations) so we can actually compress our our 100x100 matrix into an array of just 100 values stating which column per row holds the 1
.
Great, we can trivially multiply matrices in O(n^2)
time using this trick. Well, we can actually do a bit better yet- we can actually multiply these in O(n)
time. For each "row" R
(containing just the column index, I
of the 1
value) in the first matrix, look at the I
th row in our second matrix, and get its value C
. Assign our corresponding row R'
in the output matrix to equal C
.
So we can multiply these matrices in ~100 logical steps, but we need raise this matrix to the K
th power, which could be up to 109
. We just eliminated a factor of 100 from our work, just to add it right back! Well, not quite. There's a well-known method for matrix exponentiation called "exponentiation by squaring", where we cleverly stagger multiplying and squaring the result repeatedly, to calculate M^K
in log(K)
iterations/steps. I won't go into detail here since its widely known and well-documented.
Overall, that puts us at O(N log K) time, not too bad!
Update: Here is the functioning code to solve the problem.
def matmul(p, q):
return [q[I] for I in p]
def exp(m, e):
if e == 1:
return m
result = exp(matmul(m, m), e // 2)
if e % 2:
result = matmul(m, result)
return result
def solve(n, k, a1, a2, b1, b2):
a1, a2, b1, b2 = a1-1, a2-1, b1-1, b2-1
cows = list(range(1, n+1))
ma = [a1 + a2 - x if a1 <= x <= a2 else x for x in range(n)]
mb = [b1 + b2 - x if b1 <= x <= b2 else x for x in range(n)]
m0 = matmul(mb, ma)
mk = exp(m0, k)
return [cows[i] for i in mk]
After doing some benchmarks using timeit
for number=100
iterations, these were my findings (timings in seconds, smaller is better):
contributor | time (sec)
------------------------------
blhsing | 7.857691
Michael Szczesny | 5.076418
(mine) | 0.013314
So Michael's improves on blhsing's solution by running about 1.5x faster on my machine, and my solution runs about 381x faster than Michael's, taking roughly a ten-thousandth of a second per run on average.
Update 2: As mentioned in the comment of this answer, the above solution does still scale with K
, so eventually it would become intractable, and this problem seems to emit a repetitive pattern- surely we can exploit that? Well, yes, in fact we can.
The trick is that there's only so many ways we can permute these cows before they reach some previous state that we've seen before, and then form an endless cycle from there. In fact- because of the simplicity of the problem at hand, its typically a relatively small number of shuffles / reversals before we arrive back at a previous state (actually our starting state specifically, since to visit any other cow permutation without first visiting our starting state would imply that there's two distinct states that transition to said state, which cannot be- we have our system of equations which told us otherwise).
So, what we need to do if find this magic number where the cow shuffling / reversals starts repeating, and use that to reduce the exponent (in our matrix exponentiation) from K
to K mod MAGIC
. We're actually going to call this number the multiplicative order of the transition matrix. To start, notice that there may be more than one cycle of cows that shuffle positions, each of which repeats with periods. A subset of 4 cows may cycle positions every 4 iterations, while another subset of 3 cows cycles every 3 iterations, meaning together the 7 cows repeat their starting configuration every 12 iterations.
More generally, we need to find each independent cycle of cow shuffles, get their lengths, and find the least common multiple (LCM) of these periods. Once we have that, take K
mod that, and raise our matrix to that new value. Example code for doing so:
from itertools import count
from functools import reduce
from math import lcm
def order(m):
cycle_lens = []
unvisited = set(m)
while unvisited:
start = head = unvisited.pop()
for size in count(1):
head = m[head]
if head == start:
cycle_lens.append(size)
break
unvisited.discard(head)
return reduce(lcm, cycle_lens)
# And inside solve()-
# replace: mk = exp(m0, k)
# with: mk = exp(m0, k % order(m0))
Here is an approach that is O(N)
work no matter what K
might be.
Condensed explanation: Rewrite the permutation in cycle notation. Use cycle notation to generate the answer. (Except we don't need the full cycle notation.
To illustrate I'll use your example, but I'll find the permutation after 999 steps.
As you note, [1,5,7,6,2,3,4]
is the result one iteration of A then B. Calculating that, in this form, clearly takes O(N)
work. It is a little more convenient to write that as:
{
1: 1,
2: 5,
3: 6,
4: 7,
5: 2,
6: 4,
7: 3
}
Again, this translation takes O(N)
work.
Now let's compute a partial answer. We start with [0,0,0,0,0,0,0]' where
0` means "unknown".
First step, we find 1 -> 1
so the first cycle is just (1)
. Going around this cycle 999 times gives us (1)
again, so we now have [1,0,0,0,0,0,0]
.
Second step, we find that 2 -> 5 -> 2
(note, we're just looking these up in the lookup, so each one is O(1)
work) so the second cycle is (2, 5)
. Going around this cycle 999 steps means we get to fill in 2 values. We now have [1,5,0,0,2,0,0]
.
Third step, we find that 3 -> 6 -> 4 -> 7 -> 3
. So the third cycle is (3, 6, 4, 7)
. Going around this 999 times is like walking 3 steps forward, so now we can fill in: [1,5,7,6,2,3,4]
.
As we go through the rest of the numbers, we find that they are all filled in. So our answer is [1,5,7,6,2,3,4]
.
In general, each number is part of a cycle of length j
. When we find a cycle, we take O(j)
work to find the cycle, and then for another O(j)
work we fill in the answer of what happens to that cycle. Afterwards we will hit j-1
elements that are filled in and skip them. So we get n
elements for O(n)
work, for amortized O(1)
work per element.
The result is O(N)
work to find the final answer.
The main problem with the performance of your code is that you're using a list to keep a history of cow positions in each iteration in order to detect a cycle, which requires O(n) for each membership lookup with the in
operator.
Instead, you can use a set for the purpose, which costs O(1) in membership lookups. But since you still need to then iterate k % i times, where i is the length of the cycle, to get to that specific position in the cycle, it would be better if the set is ordered so you can simply get the (k % i)-indexed entry in the set instead of having to perform that many times of reversals. But since set is unordered in Python, you can instead use a dict, where the keys are ordered since Python 3.6:
from itertools import islice
n, k, a1, a2, b1, b2 = map(int, '''100 1000000000
1 94
2 98'''.split())
cows = list(range(1, n + 1))
history = {}
for i in range(k):
key = tuple(cows)
if key in history:
cows = next(islice(history, k % i, None))
break
history[key] = 1
for bound in map(slice, (a1 - 1, b1 - 1), (a2, b2)):
cows[bound] = reversed(cows[bound])
print(*cows, sep='\n')
This outputs:
71
2
3
74
10
76
7
8
79
15
81
12
13
84
20
86
17
18
89
25
91
22
23
94
30
96
27
28
1
35
4
32
33
6
40
9
37
38
11
45
14
42
43
16
50
19
47
48
21
55
24
52
53
26
60
29
57
58
31
65
34
62
63
36
70
39
67
68
41
75
44
72
73
46
80
49
77
78
51
85
54
82
83
56
90
59
87
88
61
95
64
92
93
66
5
69
97
98
99
100
Demo: https://replit.com/@blhsing/CultivatedPointedArchitect
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With