Python best way to 'swap' words (multiple characters) in a string?

Tags:

python

string

python-3.x

Consider the following examples:

string_now = 'apple and avocado'
stringthen = string_now.swap('apple', 'avocado') # stringthen = 'avocado and apple'

and:

string_now = 'fffffeeeeeddffee'
stringthen = string_now.swap('fffff', 'eeeee') # stringthen = 'eeeeefffffddffee'

The approaches discussed in "Swap character of string in Python" do not work here, because the mapping technique used there handles only single characters. Python's built-in str.maketrans() likewise supports only one-character keys; when I pass multi-character keys, it raises the following error:

Traceback (most recent call last):
  File "main.py", line 4, in <module>
    s.maketrans(mapper)
ValueError: string keys in translate table must be of length 1

A chain of replace() calls is far from ideal: I have many replacements to do, so chaining them would be a big chunk of code, and because the calls run sequentially, a later replacement can undo an earlier one. For example:

string_now = 'apple and avocado'
stringthen = string_now.replace('apple', 'avocado').replace('avocado', 'apple')

gives 'apple and apple' instead of 'avocado and apple'.

What's the best way to achieve this?

asked Dec 03 '21 03:12

Hamza

4 Answers

Given that we want to swap words x and y, and that we don't care about the situation where they overlap, we can:

split the string on occurrences of x
within each piece, replace y with x
join the pieces with y

Essentially, we use split points within the string as a temporary marker to avoid the problem with sequential replacements.

Thus:

def swap_words(s, x, y):
    return y.join(part.replace(y, x) for part in s.split(x))

Test it:

>>> swap_words('apples and avocados and avocados and apples', 'apples', 'avocados')
'avocados and apples and apples and avocados'
>>>
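
To see why the split points act as a temporary marker, here is a small trace of the three steps on the second example from the question. The helper name `swap_words_traced` is made up for this illustration; it only returns the intermediate values alongside the result.

```python
def swap_words_traced(s, x, y):
    # Step 1: every occurrence of x disappears and becomes a split point.
    parts = s.split(x)
    # Step 2: no piece contains x any more, so replacing y with x is safe.
    swapped = [part.replace(y, x) for part in parts]
    # Step 3: the split points (where x used to be) are filled with y.
    return parts, swapped, y.join(swapped)


parts, swapped, result = swap_words_traced('fffffeeeeeddffee', 'fffff', 'eeeee')
print(parts)    # ['', 'eeeeeddffee']
print(swapped)  # ['', 'fffffddffee']
print(result)   # eeeeefffffddffee
```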

answered Oct 23 '22 23:10

Karl Knechtel

Two regex solutions, plus one for people who do have a character that cannot appear in their input (there are over a million possible characters, after all) and who don't mind replace chains :-)

import re

def swap_words_regex1(s, x, y):
    return re.sub(re.escape(x) + '|' + re.escape(y),
                  lambda m: (x if m[0] == y else y),
                  s)

def swap_words_regex2(s, x, y):
    return re.sub(f'({re.escape(x)})|{re.escape(y)}',
                  lambda m: x if m[1] is None else y,
                  s)

def swap_words_replaces(s, x, y):
    return s.replace(x, chr(0)).replace(y, x).replace(chr(0), y)
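
Since the question mentions having many replacements to do, the single-pass regex idea can be stretched to several pairs at once. The following is only a sketch, not part of the answer above: `swap_many` and its `pairs` parameter are hypothetical names, and it assumes every word is non-empty and appears in at most one pair.

```python
import re


def swap_many(s, pairs):
    # Build a two-way lookup: each word maps to its partner.
    mapping = {}
    for x, y in pairs:
        mapping[x] = y
        mapping[y] = x
    if not mapping:
        return s  # nothing to swap; an empty pattern would match everywhere
    # Try longer words first so a word is not shadowed by one of its prefixes.
    words = sorted(mapping, key=len, reverse=True)
    pattern = '|'.join(re.escape(word) for word in words)
    # One pass over the string: replaced text is never scanned again.
    return re.sub(pattern, lambda m: mapping[m[0]], s)


print(swap_many('apple and avocado, cat and dog',
                [('apple', 'avocado'), ('cat', 'dog')]))
# avocado and apple, dog and cat
```

Because `re.sub` scans the string once and never revisits text it has already substituted, the sequential-replacement problem from the question does not arise.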

Some benchmark results:

 3.7 ms  1966 kB  swap_words_split
10.7 ms  2121 kB  swap_words_regex1
17.8 ms  2121 kB  swap_words_regex2
 1.3 ms   890 kB  swap_words_replaces

Full code:

from timeit import repeat
import re
import tracemalloc as tm

def swap_words_split(s, x, y):
    return y.join(part.replace(y, x) for part in s.split(x))

def swap_words_regex1(s, x, y):
    return re.sub(re.escape(x) + '|' + re.escape(y),
                  lambda m: (x if m[0] == y else y),
                  s)

def swap_words_regex2(s, x, y):
    return re.sub(f'({re.escape(x)})|{re.escape(y)}',
                  lambda m: x if m[1] is None else y,
                  s)

def swap_words_replaces(s, x, y):
    return s.replace(x, chr(0)).replace(y, x).replace(chr(0), y)

funcs = swap_words_split, swap_words_regex1, swap_words_regex2, swap_words_replaces

args = 'apples and avocados and bananas and oranges and ' * 10000, 'apples', 'avocados'

for _ in range(3):
    for func in funcs:
        t = min(repeat(lambda: func(*args), number=1))
        tm.start()
        func(*args)
        memory = tm.get_traced_memory()[1]
        tm.stop()
        print(f'{t * 1e3:4.1f} ms  {memory // 1000:4} kB  {func.__name__}')
    print()

answered Oct 24 '22 00:10

Kelly Bundy

This solution uses `str.format()`:

string_now = "apple and avocado"
stringthen = (  # "avocado and apple"
    string_now.replace("apple", "{apple}")
    .replace("avocado", "{avocado}")
    .format(apple="avocado", avocado="apple")
)

# Edit: as a function
def swap_words(s, x, y):
    # The parentheses let the chained calls span several lines.
    return (
        s.replace(x, "{" + x + "}")
        .replace(y, "{" + y + "}")
        .format(**{x: y, y: x})
    )

It first adds curly brackets before and after keywords to turn them into placeholders. Then str.format() is used to replace the placeholders.
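
One thing to watch with this technique: if the input already contains literal curly brackets, str.format() will try to interpret them. A possible workaround, sketched below under the name `swap_words_format` (a made-up name), is to double any existing brackets first so that str.format() turns them back into single ones. This assumes x and y are plain words: no brackets, dots, colons or other format-spec characters, not purely digits, and neither contained in the other.

```python
def swap_words_format(s, x, y):
    # Double existing braces so str.format() emits them as literal characters.
    escaped = s.replace("{", "{{").replace("}", "}}")
    # Turn each keyword into a named placeholder.
    templated = escaped.replace(x, "{" + x + "}").replace(y, "{" + y + "}")
    # Fill each placeholder with the other word.
    return templated.format(**{x: y, y: x})


print(swap_words_format("set {apple} and avocado", "apple", "avocado"))
# set {avocado} and apple
```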

answered Oct 23 '22 22:10

Stefan_EOX

Why not just use a temporary string that will never appear in the original string?

For example:

>>> a = 'apples and avocados and avocados and apples'
>>> b = a.replace('apples', '#IamYourFather#').replace('avocados', 'apples').replace('#IamYourFather#', 'avocados')
>>> print(b)
avocados and apples and apples and avocados

where #IamYourFather# is a string that will never appear in the original string.
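
If you cannot be sure in advance which placeholder is safe, one option is to pick it at run time. The sketch below is an assumption-laden variation, not part of the answer above: `swap_words_safe_marker` is a made-up name, and it uses a single character that occurs in neither the string nor the two (non-empty) words as the placeholder.

```python
import sys


def swap_words_safe_marker(s, x, y):
    # Collect every character that is already in use.
    used = set(s) | set(x) | set(y)
    # Take the first code point that is unused; for realistic inputs this
    # is found almost immediately (often chr(0)).
    marker = next(chr(i) for i in range(sys.maxunicode + 1) if chr(i) not in used)
    # Same three-step chain as above, with a placeholder known to be absent.
    return s.replace(x, marker).replace(y, x).replace(marker, y)


print(swap_words_safe_marker('apples and avocados and avocados and apples',
                             'apples', 'avocados'))
# avocados and apples and apples and avocados
```

Building the set of used characters costs an extra pass over the string, so this trades a little speed for not having to guess a placeholder.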

answered Oct 23 '22 22:10

Kingname
