I want to get the minimum number of letter-swaps needed to convert one string to match a second string. Only adjacent swaps are allowed.
Inputs are: length of strings, string_1, string_2
Some examples:
Length | String 1 | String 2 | Output
-------+----------+----------+-------
3 | ABC | BCA | 2
7 | AABCDDD | DDDBCAA | 16
7 | ZZZAAAA | ZAAZAAZ | 6
Here's my code:
def letters(number, word_1, word_2):
result = 0
while word_1 != word_2:
index_of_letter = word_1.find(word_2[0])
result += index_of_letter
word_1 = word_1.replace(word_2[0], '', 1)
word_2 = word_2[1:]
return result
It gives the correct results, but the calculation should stay under 20 seconds.
Here are two sets of input data (1 000 000 characters long strings): https://ufile.io/8hp46 and https://ufile.io/athxu.
On my setup the first one is executed in around 40 seconds and the second in 4 minutes.
How to calculate the result in less than 20 seconds?
@KennyOstrom's is 90% there. The inversion count is indeed the right angle to look at this problem.
The only bit that is missing is that we need a "relative" inversion count, meaning the number of inversions not to get to normal sort order but to the other word's order. We therefore need to compute the permutation that stably maps word1 to word2 (or the other way round), and then compute the inversion count of that. Stability is important here, because obviously there will be lots of nonunique letters.
Here is a numpy implementation that takes only a second or two for the two large examples you posted. I did not test it extensively, but it does agree with @trincot's solution on all test cases. For the two large pairs it finds 1819136406
and 480769230766
.
import numpy as np
_, word1, word2 = open("lit10b.in").read().split()
word1 = np.frombuffer(word1.encode('utf8')
+ (((1<<len(word1).bit_length()) - len(word1))*b'Z'),
dtype=np.uint8)
word2 = np.frombuffer(word2.encode('utf8')
+ (((1<<len(word2).bit_length()) - len(word2))*b'Z'),
dtype=np.uint8)
n = len(word1)
o1 = np.argsort(word1, kind='mergesort')
o2 = np.argsort(word2, kind='mergesort')
o1inv = np.empty_like(o1)
o1inv[o1] = np.arange(n)
order = o2[o1inv]
sum_ = 0
for i in range(1, len(word1).bit_length()):
order = np.reshape(order, (-1, 1<<i))
oo = np.argsort(order, axis = -1, kind='mergesort')
ioo = np.empty_like(oo)
ioo[np.arange(order.shape[0])[:, None], oo] = np.arange(1<<i)
order[...] = order[np.arange(order.shape[0])[:, None], oo]
hw = 1<<(i-1)
sum_ += ioo[:, :hw].sum() - order.shape[0] * (hw-1)*hw // 2
print(sum_)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With