Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace two adjacent duplicate characters in string with the next character in alphabet

I want to write a function, that will find the first occurrence of two adjacent characters that are the same, replace them with a single character that is next in the alphabet and go over the string until there are no duplicates left. In case of "zz" it should go in a circular fashion back to "a". The string can only include characters a-z, that is, no capital letters or non-alphabetical characters. I have written a function that does it, but it is not effective enough for a very long string.

def solve(s):
    i = 1
    while i < len(s):
        if s[i] == s[i-1]:
            r = s[i+1:]
            l = s[:i-1]
            if s[i] == "z":
                x = "a"
            else:
                x = chr(ord(s[i])+1)
            i = 1
            s = l+x+r
        else:
            i += 1
    return s

So for example if s = 'aabbbc' the function should work like aabbbc --> bbbbc --> cbbc --> ccc and finally return dc. How can I make it more efficient?

Edit: for example if s = 'ab'*10**4 + 'cc'*10**4 + 'dd'*10**4 this function is taking a lot of time.

like image 888
user932895 Avatar asked Oct 29 '25 19:10

user932895


2 Answers

As a trivial optimisation: instead of the hard reset i = 1, you can use a softer reset i = i-1. Indeed, if there was no duplicate between 1 and i before the transformation, then there still won't be any duplicate between 1 and i-1 after the transformation.

def solve(s):
    i = 1
    while i < len(s):
        if s[i] == s[i-1]:
            l = s[:i-1]
            r = s[i+1:]  # I swapped these two lines because I read left-to-right
            if s[i] == "z":
                x = "a"
            else:
                x = chr(ord(s[i])+1)
            s = l+x+r
            i = max(1, i-1)  # important change is here
        else:
            i += 1
    return s

s = 'ab'*10**4 + 'cc'*10**4 + 'dd'*10**4
t = solve(s)

print(t[2*10**4:])
# rqpnlih

print(t == 'ab'*10**4 + 'rqpnlih')
# True
like image 87
Stef Avatar answered Oct 31 '25 10:10

Stef


You can build the result in a list that you use like a stack, adding letters that are different from the last one and popping back with the next letter when they match:

def compact(s):
    result = []
    nextChar = str.maketrans({i+96:i%26+97 for i in range(1,27)})
    for c in s:
        while result and result[-1] == c:
            c = result.pop(-1).translate(nextChar)
        result.append(c)
    return "".join(result)
    

output:

print(compact("aabbbc"))
# dc

s = 'ab'*10**4 + 'cc'*10**4 + 'dd'*10**4
print(compact(s)[-11:])
# ababrqpnlih
like image 26
Alain T. Avatar answered Oct 31 '25 08:10

Alain T.



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!