
How to search for the last occurrence of a regular expression in a string in Python?

In Python, I can easily search for the first occurrence of a regex within a string like this:

import re
re.search("pattern", "target_text")

Now I need to find the last occurrence of the regex in a string, which doesn't seem to be supported by the re module.

I can reverse the string to "search for the first occurrence", but I also need to reverse the regex, which is a much harder problem.

I can also iterate to find all occurrences from left to right, and just keep the last one, but that looks awkward.

Is there a smart way to find the rightmost occurrence?

NeoWang asked Oct 20 '15 09:10



2 Answers

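One approach is to prefix the regex with (?s:.*), which forces the engine to try matching at the furthest position first and then gradually back off. Because the greedy prefix becomes part of the overall match, wrap the pattern in a group and read the result from that group:

m = re.search("(?s:.*)(pattern)", "target_text")  # the last occurrence is m.group(1)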

Note that the result of this method may differ from re.findall("pattern", "target_text")[-1], since findall only searches for non-overlapping matches, so not every substring that could match is included in its result.

For example, when matching the regex a.a against abaca, findall returns aba as the only match, so aba is selected as the last match, while the code above returns aca.
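To make the difference concrete, here is a minimal sketch of the abaca example. The helper name last_match_greedy and the group name last are made up for illustration, and the scoped (?s:...) flag assumes Python 3.6 or later. Since the wrapper adds a capturing group, numbered backreferences inside the pattern would shift by one.

```python
import re

def last_match_greedy(pattern, text):
    """Return the text of the rightmost-starting match, or None.

    Hypothetical helper: the greedy (?s:.*) prefix consumes as much of the
    string as possible, so the first successful match is the one whose
    start is furthest to the right.
    """
    m = re.search(r"(?s:.*)(?P<last>" + pattern + ")", text)
    return m.group("last") if m else None

# findall only reports non-overlapping matches, scanning left to right.
print(re.findall("a.a", "abaca")[-1])     # aba
# The greedy prefix also finds matches that overlap an earlier one.
print(last_match_greedy("a.a", "abaca"))  # aca
```

Which of the two results counts as "the last occurrence" depends on whether overlapping matches should be considered.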


Yet another alternative is to use the regex package, which supports a REVERSE matching mode.

The result should be more or less the same as the (?s:.*) method with the re package described above. However, since I haven't tried the package myself, it's not clear to me how backreferences work in REVERSE mode; the pattern might require modification in such cases.

nhahtdh answered Sep 23 '22 02:09


import re
re.search("pattern(?!.*pattern)", "target_text")

or

import re
re.findall("pattern", "target_text")[-1]

You can use either of these two approaches.
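One caveat with the lookahead version, sketched below under the assumption that the target text may span several lines: by default "." does not match a newline, so the lookahead only rules out later occurrences on the same line. Scoping DOTALL to the lookahead (Python 3.6+) is one way around that. The sample text is made up for illustration.

```python
import re

text = "abc\nabc\nabc"

# Without DOTALL, "." stops at the newline, so the lookahead sees no later
# "abc" on the first line and the first occurrence is returned.
m_plain = re.search(r"abc(?!.*abc)", text)

# With DOTALL scoped to the lookahead, the rest of the string is scanned,
# so only the final occurrence passes the negative lookahead.
m_dotall = re.search(r"abc(?!(?s:.*)abc)", text)

print(m_plain.start())   # 0
print(m_dotall.start())  # 8
```

Also keep in mind that findall(...)[-1] raises IndexError when there is no match at all, and that findall returns the group contents rather than the whole match when the pattern contains capturing groups.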

If you want the positions, use:

import re
x = "abc abc abc"
print([(m.start(), m.end(), m.group()) for m in re.finditer(r"abc", x)][-1])  # (8, 11, 'abc')
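If building the whole list just to take its last element feels wasteful, the same idea can be sketched as a loop that keeps only the most recent match. The helper name last_match is hypothetical; it returns None instead of raising IndexError when nothing matches.

```python
import re

def last_match(pattern, text):
    """Return the last non-overlapping match object, or None if there is none."""
    last = None
    for m in re.finditer(pattern, text):
        last = m  # overwrite until the iterator is exhausted
    return last

m = last_match(r"abc", "abc abc abc")
print((m.start(), m.end(), m.group()))    # (8, 11, 'abc')
print(last_match(r"xyz", "abc abc abc"))  # None
```

Like findall, this only sees non-overlapping matches found from left to right.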

vks answered Sep 21 '22 02:09