
Python: Regex to find but not include an alphanumeric

Tags:

python

regex

Is there a regular expression that finds, for example, ">ab" but does not include ">" in the result?

I want to replace some strings using re.sub, and I want to find strings starting with ">" without removing the ">".

The Student asked Feb 28 '26 05:02


2 Answers

You want a positive lookbehind assertion. See the docs.

r'(?<=>)ab'

The lookbehind pattern must have a fixed width; it can't match a variable number of characters. The general form is:

r'(?<=stringiwanttobebeforethematch)stringiwanttomatch'

So, an example:

import re

# Replace 'ab' with 'e' only if it is preceded by '>'.

# Here we have '>ab', so we get '>ecd'.
print(re.sub(r'(?<=>)ab', 'e', '>abcd'))

# Here we have 'ab' but no '>', so we get 'abcd' unchanged.
print(re.sub(r'(?<=>)ab', 'e', 'abcd'))
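
To make the fixed-width restriction concrete, here is a minimal sketch, assuming the standard library re module (third-party engines may behave differently). The variable names are only illustrative. It shows a variable-width lookbehind being rejected when the pattern is compiled, and a capture group used as one possible workaround:

```python
import re

# Fixed-width lookbehind: exactly one '>' must precede 'ab'.
fixed = re.sub(r'(?<=>)ab', 'e', '>abcd')

# A variable-width lookbehind ('>+' can span one or more characters)
# is rejected by the standard re module when the pattern is compiled.
try:
    re.compile(r'(?<=>+)ab')
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# One workaround: capture the variable-width prefix and put it back
# with a backreference in the replacement string.
grouped = re.sub(r'(>+)ab', r'\1e', '>>>abcd')

print(fixed, variable_width_ok, grouped)
```

The capture-group form consumes the prefix and then restores it, so the end result matches what a lookbehind would give when the prefix length is not known in advance.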
agf answered Mar 01 '26 17:03


You can use a backreference in the replacement string passed to re.sub:

import re
test = """
>word
>word2
don't replace
"""
# Group 1 captures '>', and \1 puts it back in front of the replacement text.
print(re.sub(r'(>).*', r'\1replace!', test))

Outputs:

>replace!
>replace!
don't replace

I believe this accomplishes what you actually want when you say "I want to replace some strings using re.sub, and I want to find strings starting with '>' without remove the '>'."
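
As a rough comparison, assuming Python 3 and the standard re module, the sketch below checks that the backreference and lookbehind forms give the same result on this kind of input. It also shows an anchored variant for the case where ">" should only count at the start of a line. The sample strings and variable names are made up for illustration:

```python
import re

test = ">word\n>word2\ndon't replace\n"

# Backreference form: '>' is consumed by the match, then restored via \1.
backref = re.sub(r'(>).*', r'\1replace!', test)

# Lookbehind form: '>' is only asserted, so it never leaves the string.
lookbehind = re.sub(r'(?<=>).*', 'replace!', test)

# Anchored form: with re.MULTILINE, '^' matches at the start of each line,
# so a '>' in the middle of a line is left alone.
mixed = "a > b\n>c\n"
anchored = re.sub(r'^(>).*', r'\1replace!', mixed, flags=re.MULTILINE)

print(backref == lookbehind)
print(anchored)
```

Without the anchor, both patterns would also rewrite everything after the ">" in "a > b", which may or may not be what is wanted.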

Brent Newey answered Mar 01 '26 18:03