Get Group Match in re.sub in Python

Tags: python, regex, perl

In Perl, I can perform a substitution and capture a group match at the same time, e.g.:

my $string = "abcdef123";
$string =~ s/(\d+)//;
my $groupMatched = $1; # $groupMatched is 123

In Python, I can do the substitution using the re.sub function as follows. However, I cannot find a way to capture the \d+ group match without calling another function, such as re.match, and performing an additional operation.

import re

string = "abcdef123"
string = re.sub(r"(\d+)", "", string)  # raw string keeps \d intact

Does anyone know how I can capture the "\d+" matched value as a separate variable from the same re.sub operation? I tried the following command and it doesn't work.

print r'\1'
asked Mar 24 '16 08:03 by KT8


2 Answers

You can cheat and pass a function to re.sub:

import re

results = []

def capture_and_kill(match):
    # Keep the match object, then replace the matched text with nothing.
    results.append(match)
    return ""

string = "abcdef123"
string = re.sub(r"(\d+)", capture_and_kill, string)
results[0].group(1)
# => '123'
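If the module-level results list feels awkward, the same callback idea can be wrapped in a small helper that returns both the new string and the match objects. This is only a sketch: sub_and_capture is a hypothetical name, not part of the re module, and it assumes the replacement is a string template rather than a function.

```python
import re

def sub_and_capture(pattern, repl, text, count=0):
    """Hypothetical helper: substitute like re.sub and also return the matches."""
    matches = []

    def _record(match):
        # Remember the match object so its groups can be read afterwards.
        matches.append(match)
        # expand() applies backreferences such as \1 in the template.
        return match.expand(repl)

    new_text = re.sub(pattern, _record, text, count=count)
    return new_text, matches

new_string, matches = sub_and_capture(r"(\d+)", "", "abcdef123")
print(new_string)           # abcdef
print(matches[0].group(1))  # 123
```

Passing count=1 would replace only the first match, which is closer to Perl's s/// without the /g flag.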
answered Oct 02 '22 21:10 by Amadan


You can do the following:

sub_str = re.search(r"(\d+)", string).group(1)

This finds the "123" part.

Then you replace it:

string = string.replace(sub_str, "")

Note that if the string contains more than one [0-9] sequence, you'll need to use findall and iterate over all the matches manually.
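For the multi-match case, one possible approach is to collect every match first and then remove them all in a single pass. The sketch below uses re.finditer rather than findall so that the match objects stay available; the sample string "abc123def456" is made up for illustration.

```python
import re

string = "abc123def456"  # illustrative input with two digit runs
pattern = re.compile(r"\d+")

# Collect every digit sequence before removing anything.
captured = [m.group(0) for m in pattern.finditer(string)]

# Remove all of the digit sequences in one pass.
string = pattern.sub("", string)

print(captured)  # ['123', '456']
print(string)    # abcdef
```

This scans the string twice, which is usually fine for short inputs; the callback approach does the same work in one pass.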

answered Oct 02 '22 21:10 by Maroun