
How to find and replace nth occurrence of word in a sentence using python regular expression?

Tags: python, regex

Using Python regular expressions only, how can I find and replace the nth occurrence of a word in a sentence? For example:

import re

s = 'cat goose  mouse horse pig cat cow'
new_str = re.sub(r'cat', r'Bull', s)     # replaces every 'cat'
new_str = re.sub(r'cat', r'Bull', s, 1)  # replaces only the first 'cat'
new_str = re.sub(r'cat', r'Bull', s, 2)  # replaces the first two, i.e. both

The word 'cat' appears twice in the sentence above. I want the 2nd occurrence of 'cat' changed to 'Bull', leaving the 1st 'cat' untouched, so the final sentence would be "cat goose mouse horse pig Bull cow". I tried the three calls above, but none of them gave me what I wanted.

juggernaut asked Dec 21 '14 12:12



1 Answer

Use a negative lookahead, as shown below.

>>> import re
>>> s = "cat goose  mouse horse pig cat cow"
>>> re.sub(r'^((?:(?!cat).)*cat(?:(?!cat).)*)cat', r'\1Bull', s)
'cat goose  mouse horse pig Bull cow'


  • ^ asserts that we are at the start of the string.
  • (?:(?!cat).)* matches any character, zero or more times, as long as the substring cat does not begin at that character.
  • cat matches the first cat substring.
  • (?:(?!cat).)* again matches zero or more characters without running into another cat.
  • All of the above is enclosed in a capturing group, ((?:(?!cat).)*cat(?:(?!cat).)*), so the captured characters can be put back with \1 in the replacement.
  • cat then matches the second cat, which is the only part replaced by Bull.
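
The pieces listed above can also be written out with re.VERBOSE, which lets each part of the pattern carry its own comment. This is only a sketch of the same regex; the name SECOND_CAT is made up for illustration and does not come from the answer.

```python
import re

# Same pattern as in the answer, spread over several lines with re.VERBOSE.
# SECOND_CAT is a hypothetical name chosen for this sketch.
SECOND_CAT = re.compile(r"""
    ^                   # anchor at the start of the string
    (                   # group 1: everything before the second cat
      (?:(?!cat).)*     #   characters at which no cat begins
      cat               #   the first cat
      (?:(?!cat).)*     #   more characters at which no cat begins
    )
    cat                 # the second cat, the only part that gets replaced
""", re.VERBOSE)

s = "cat goose  mouse horse pig cat cow"
result = SECOND_CAT.sub(r"\1Bull", s)
print(result)
```

If the string contains fewer than two occurrences of cat, the pattern does not match and the string comes back unchanged.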

OR

>>> s = "cat goose  mouse horse pig cat cow"
>>> re.sub(r'^(.*?(cat.*?){1})cat', r'\1Bull', s)
'cat goose  mouse horse pig Bull cow'

The number inside the {} is how many occurrences of cat to skip before the one that gets replaced, so {n-1} replaces the nth occurrence: {0} replaces the first, {1} the second, and so on.

To replace the third occurrence of the string cat, put 2 inside the curly braces:

>>> re.sub(r'^(.*?(cat.*?){2})cat', r'\1Bull', "cat goose  mouse horse pig cat foo cat cow")
'cat goose  mouse horse pig cat foo Bull cow'
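
The counting pattern above can be wrapped in a small helper so that n is passed as an argument instead of being edited into the regex by hand. This is a minimal sketch, not part of the original answer: the function name replace_nth and its parameters are hypothetical, and the use of re.escape and re.DOTALL are extra assumptions (literal search word, text that may span several lines).

```python
import re

def replace_nth(text, word, repl, n):
    """Replace only the n-th (1-based) occurrence of `word` in `text`.

    Hypothetical helper for illustration; it builds the same
    ^(.*?(word.*?){n-1})word pattern used in the answer.
    """
    if n < 1:
        raise ValueError("n must be 1 or greater")
    target = re.escape(word)  # assumption: `word` is literal text, not a regex
    # Group 1 lazily skips n-1 occurrences; the final `target` is the n-th one.
    pattern = rf"^(.*?(?:{target}.*?){{{n - 1}}}){target}"
    # A function replacement keeps backslashes in `repl` from being interpreted.
    return re.sub(pattern, lambda m: m.group(1) + repl, text,
                  count=1, flags=re.DOTALL)

s = "cat goose  mouse horse pig cat cow"
print(replace_nth(s, "cat", "Bull", 2))
print(replace_nth("cat goose  mouse horse pig cat foo cat cow", "cat", "Bull", 3))
```

Like the original pattern, this matches cat as a substring, so it would also count the cat in a word such as concatenate; surrounding the escaped word with \b would restrict it to whole words. When the text has fewer than n occurrences, it is returned unchanged.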


Avinash Raj answered Oct 12 '22 22:10