Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to optimize str.replace() in Python

I am working on a binary string (i.e it only contains 1 and 0) and I need to run a function N number of times. This function replaces any instance of '01' in string to '10'. However, str.replace takes too much time to process the output, especially when the the length of string as well as N can be as big as 10^6.

I have tried implementing regex but it hasn't provided me with any optimization, instead taking more time to perform the task.

For example, if the string given to me is 01011 and N is equal to 1, then the output should be 10101. Similarly, if N becomes 2, the output becomes 11010 and so on.

Are there any optimizations of str.replace in python or is there any bit manipulation I could do to optimize my code?

like image 262
Rosa Avatar asked May 11 '19 11:05

Rosa


People also ask

Is Python replace efficient?

replace is already fairly efficient, but it does need to scan the strings many times and, at least in some cases, create several new strings.

What is the time complexity of replace in Python?

If you look at the Python 3.11 source code (Python replace method), you may see this has an average case of O(N) and a worst case of O(N+M), where N is the length of the string and M is number of occurrences of the substring we're looking to replace.

Why str replace is not working?

1 Answer. You are facing this issue because you are using the replace method incorrectly. When you call the replace method on a string in python you get a new string with the contents replaced as specified in the method call. You are not storing the modified string but are just using the unmodified string.

What does replace () mean in Python?

The replace() in Python returns a copy of the string where all occurrences of a substring are replaced with another substring.


1 Answers

Let's think of the input as bits forming an unsigned integer, possible a very large one. For example:

  1001 1011  # input number X
  0100 1101  # Y = X>>1 = X//2 -- to get the previous bit to the same column
  1001 0010  # Z1 = X & ~Y -- We are looking for 01, i.e. 1 after previous 0
  0001 0010  # Z2 = Z1 with the highest bit cleared, because we don't want
             # to process the implicit 0 before the number
  1010 1101  # output = X + Z2, this adds 1 where 01's are;
             # 1 + 01 = 10, this is what we want

Thus we can process the whole list just with few arithmetic operations.


Update: sample code, I tried to address the comment about leading zeroes.

xstr = input("Enter binary number: ")
x = int(xstr, base=2)
digits = len(xstr)
mask = 2**(digits-1) - 1 
print("{:0{width}b}".format(x,width=digits))

while True:
    z2 = x & ~(x >> 1) & mask
    if z2 == 0:
        print("final state")
        break
    x += z2
    print("{:0{width}b}".format(x,width=digits))
like image 155
VPfB Avatar answered Nov 09 '22 22:11

VPfB