Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Removing substring from string in Python

Tags:

python

I want to write a function that takes 2 inputs: a string and a substring, then the function will remove that part of the substring from the string.

def remove_substring(s, substr):
   """(str, str) -> NoneType
Returns string without string substr

remove_substring_from_string("Im in cs", "cs")
Im in
    """
    other_s = ''
for substr in s:
    if substr in s:
        continue

How do I continue on from here? Assuming my logic is sound.

like image 875
Michel Hijazin Avatar asked Nov 17 '25 04:11

Michel Hijazin


2 Answers

Avoiding the use of Python functions.

Method 1

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    i = 0
    while i < len(s) - len(substr) + 1:
        # Check if substring starts at i
        if s[i:i+len(substr)] == substr:
            break   
        i += 1
    else:
        # break not hit, so substr not found
        return s
    
    # break hit
    return s[:i] + s[i+len(substr):]

Method 2

If the range function can be used, the above can be written more compactly as follows.

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    for i in range(len(s) - len(substr) + 1):
        if s[i:i+len(substr)] == substr:
            break
    else:
        # break not hit, so substr not found
        return s
    
    return s[:i] + s[i+len(substr):]

Test

print(remove_substring_from_string("I have nothing to declare except my genuis", " except my genuis"))
# Output: I have nothing to declare'
like image 85
DarrylG Avatar answered Nov 18 '25 18:11

DarrylG


This approach is based on the KMP algorithm:

def KMP(s):
    n = len(s)
    pi = [0 for _ in range(n)]

    for i in range(1, n):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        
        if s[i] == s[j]:
            j += 1
        
        pi[i] = j
    
    return pi

# Removes all occurences of t in s
def remove_substring_from_string(s, t):
    n = len(s)
    m = len(t)
    
    # Calculate the prefix function using KMP
    pi = KMP(t + '\x00' + s)[m + 1:]
    r = ""

    i = 0
    while i + m - 1 < n: # Before the remaining string is smaller than the substring
        if pi[i + m - 1] == m: # If the substring is here, skip it
            i += m
        else: # Otherwise, add the current character and move to the next
            r += s[i]
            i += 1
    
    # Add the remaining string
    r += s[i:]
    return r

It runs in O(|s| + |t|), but it has a few downsides:

  • The code is long and unintuitive.
  • It requires that there is no null (\x00) in the input strings.
  • Its constant factors are pretty bad for short s and t.
  • It doesn't handle overlapping strings how you might want it: remove_substring_from_string("aaa", "aa") will return "a". The only guarantee made is that t in remove_substring_from_string(s, t) is False for any two strings s and t.

A C++ example and further explanation for the KMP algorithm can be found here. The remove_substring_from_string function then only checks if the entire substring is matched at each position and if so, skips over the substring.

like image 32
dranjohn Avatar answered Nov 18 '25 20:11

dranjohn



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!