Python best way to 'swap' words (multiple characters) in a string?

Consider the following examples:

string_now = 'apple and avocado'
stringthen = string_now.swap('apple', 'avocado') # stringthen = 'avocado and apple'

and:

string_now = 'fffffeeeeeddffee'
stringthen = string_now.swap('fffff', 'eeeee') # stringthen = 'eeeeefffffddffee'

The approaches discussed in "Swap character of string in Python" do not work here, because the mapping technique used there only handles single characters. Python's built-in str.maketrans() likewise only supports one-character keys; when I try multi-character keys, it raises the following error:

Traceback (most recent call last):
  File "main.py", line 4, in <module>
    s.maketrans(mapper)
ValueError: string keys in translate table must be of length 1
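
A minimal reproduction of that error might look like the following. The mapper dictionary is an assumption, since the original main.py is not shown. Single-character keys are accepted (and may map to longer strings), but a multi-character key is rejected:

```python
# Single-character keys are accepted, and values may be longer strings.
table = str.maketrans({'a': 'xyz'})
single_ok = 'abc'.translate(table)

# Hypothetical multi-character mapping, assumed to resemble the one in main.py.
mapper = {'apple': 'avocado', 'avocado': 'apple'}
try:
    str.maketrans(mapper)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(single_ok)
print(error_message)
```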

A chain of replace() calls is far from ideal (I have many replacements to do, so chaining them would be a big chunk of code), and because the calls run sequentially, it does not even give the right result:

string_now = 'apple and avocado'
stringthen = string_now.replace('apple', 'avocado').replace('avocado', 'apple')

gives 'apple and apple' instead of 'avocado and apple'.
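
Printing the intermediate value shows where the information is lost. This is just an illustrative trace; the names step1 and step2 are made up for this example:

```python
string_now = 'apple and avocado'

# Step 1 turns every 'apple' into 'avocado', so the two words become indistinguishable.
step1 = string_now.replace('apple', 'avocado')

# Step 2 then converts all of them, including the one that was originally 'avocado'.
step2 = step1.replace('avocado', 'apple')

print(step1)  # avocado and avocado
print(step2)  # apple and apple
```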

What's the best way to achieve this?

Hamza asked Dec 03 '21 03:12


4 Answers

Given that we want to swap words x and y, and that we don't care about the situation where they overlap, we can:

  • split the string on occurrences of x
  • within each piece, replace y with x
  • join the pieces with y

Essentially, we use split points within the string as a temporary marker to avoid the problem with sequential replacements.

Thus:

def swap_words(s, x, y):
    return y.join(part.replace(y, x) for part in s.split(x))

Test it:

>>> swap_words('apples and avocados and avocados and apples', 'apples', 'avocados')
'avocados and apples and apples and avocados'
>>>
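
To make the three steps concrete, here is a step-by-step trace on the second example from the question. The intermediate names (parts, swapped, result) are only for illustration; the function itself is the one above:

```python
def swap_words(s, x, y):
    return y.join(part.replace(y, x) for part in s.split(x))

s, x, y = 'fffffeeeeeddffee', 'fffff', 'eeeee'

# 1. Split on x: every occurrence of x becomes a boundary between pieces.
parts = s.split(x)

# 2. Inside each piece only y can remain, so replacing y with x is safe.
swapped = [part.replace(y, x) for part in parts]

# 3. Join with y: each boundary (a former x) is filled in with y.
result = y.join(swapped)

print(parts)    # ['', 'eeeeeddffee']
print(swapped)  # ['', 'fffffddffee']
print(result)   # eeeeefffffddffee
```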
Karl Knechtel answered Oct 23 '22 23:10


Two regex solutions, plus one for people who do have a character that can't appear in the string (there are over a million different possible characters, after all) and who don't mind replace chains :-)

import re

def swap_words_regex1(s, x, y):
    return re.sub(re.escape(x) + '|' + re.escape(y),
                  lambda m: (x if m[0] == y else y),
                  s)

def swap_words_regex2(s, x, y):
    return re.sub(f'({re.escape(x)})|{re.escape(y)}',
                  lambda m: x if m[1] is None else y,
                  s)

def swap_words_replaces(s, x, y):
    return s.replace(x, chr(0)).replace(y, x).replace(chr(0), y)

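Some benchmark results (best time and peak traced memory per function; swap_words_split is the split/join solution from the other answer):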

 3.7 ms  1966 kB  swap_words_split
10.7 ms  2121 kB  swap_words_regex1
17.8 ms  2121 kB  swap_words_regex2
 1.3 ms   890 kB  swap_words_replaces

Full code:

from timeit import repeat
import re
import tracemalloc as tm

def swap_words_split(s, x, y):
    return y.join(part.replace(y, x) for part in s.split(x))

def swap_words_regex1(s, x, y):
    return re.sub(re.escape(x) + '|' + re.escape(y),
                  lambda m: (x if m[0] == y else y),
                  s)

def swap_words_regex2(s, x, y):
    return re.sub(f'({re.escape(x)})|{re.escape(y)}',
                  lambda m: x if m[1] is None else y,
                  s)

def swap_words_replaces(s, x, y):
    return s.replace(x, chr(0)).replace(y, x).replace(chr(0), y)

funcs = swap_words_split, swap_words_regex1, swap_words_regex2, swap_words_replaces

args = 'apples and avocados and bananas and oranges and ' * 10000, 'apples', 'avocados'

for _ in range(3):
    for func in funcs:
        t = min(repeat(lambda: func(*args), number=1))
        tm.start()
        func(*args)
        memory = tm.get_traced_memory()[1]
        tm.stop()
        print(f'{t * 1e3:4.1f} ms  {memory // 1000:4} kB  {func.__name__}')
    print()
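
Since the question mentions many replacements, the single-pass regex idea might be extended to an arbitrary mapping. This is only a sketch: swap_many and its mapping argument are hypothetical names, and trying the longest keys first is an assumption about how keys that are prefixes of one another should be resolved.

```python
import re

def swap_many(s, mapping):
    """Replace every key of `mapping` with its value in a single pass."""
    # Longest keys first, so a longer key wins over a shorter key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(key) for key in keys))
    # Each match is looked up once; replaced text is never scanned again,
    # which avoids the problem of sequential replace() chains.
    return pattern.sub(lambda m: mapping[m[0]], s)

pairs = {'apple': 'avocado', 'avocado': 'apple', 'and': 'or'}
print(swap_many('apple and avocado', pairs))  # avocado or apple
```

Because the string is scanned only once, swapping a pair is just a mapping that contains both directions.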
Kelly Bundy answered Oct 24 '22 00:10


This solution uses str.format():

string_now = "apple and avocado"
stringthen = (  # "avocado and apple"
    string_now.replace("apple", "{apple}")
    .replace("avocado", "{avocado}")
    .format(apple="avocado", avocado="apple")
)

# Edit: as a function
def swap_words(s, x, y):
    # Parentheses are needed so the chained calls can span several lines.
    return (
        s.replace(x, "{" + x + "}")
        .replace(y, "{" + y + "}")
        .format(**{x: y, y: x})
    )

It first adds curly brackets before and after keywords to turn them into placeholders. Then str.format() is used to replace the placeholders.
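
One caveat: if the input already contains literal curly brackets, str.format() will try to interpret them as placeholders too. A possible workaround, sketched below, is to double the existing brackets first so that format() turns them back into literal ones. The name swap_words_format is hypothetical, and the sketch assumes that x and y themselves contain no curly brackets and do not overlap.

```python
def swap_words_format(s, x, y):
    # Escape literal braces so format() renders them unchanged.
    escaped = s.replace("{", "{{").replace("}", "}}")
    return (
        escaped.replace(x, "{" + x + "}")
        .replace(y, "{" + y + "}")
        .format(**{x: y, y: x})
    )

print(swap_words_format("{apple} and avocado", "apple", "avocado"))
# {avocado} and apple
```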

Stefan_EOX answered Oct 23 '22 22:10


Why not just use a temporary string that will never appear in the original string?

For example:

>>> a = 'apples and avocados and avocados and apples'
>>> b = a.replace('apples', '#IamYourFather#').replace('avocados', 'apples').replace('#IamYourFather#', 'avocados')
>>> print(b)
avocados and apples and apples and avocados

Here #IamYourFather# is a placeholder that never occurs in the original string.
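
If you cannot be sure which placeholder is safe, one option is to look for a single character that occurs in none of the strings involved. The sketch below is an illustration under that assumption; find_unused_char and swap_with_placeholder are hypothetical names, not part of any library.

```python
import sys

def find_unused_char(*strings):
    """Return the first code point that occurs in none of the given strings."""
    used = set().union(*strings)
    for code_point in range(sys.maxunicode + 1):
        if chr(code_point) not in used:
            return chr(code_point)
    raise ValueError("every possible character is already in use")

def swap_with_placeholder(s, x, y):
    # A single unused character cannot collide with s, x or y.
    placeholder = find_unused_char(s, x, y)
    return s.replace(x, placeholder).replace(y, x).replace(placeholder, y)

print(swap_with_placeholder('apples and avocados and avocados and apples',
                            'apples', 'avocados'))
# avocados and apples and apples and avocados
```

Using a single character rather than a longer marker means a placeholder can never be formed by accident from neighbouring text.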

Kingname answered Oct 23 '22 22:10