Get Group Match in re.sub in Python





In Perl, I can do a substitution and capture a group match at the same time, e.g.:

my $string = "abcdef123";
$string =~ s/(\d+)//;
my $groupMatched = $1; # $groupMatched is 123

In Python, I can do the substitution using the re.sub function as follows. However, I cannot find a way to capture the \d+ group match without calling another function such as re.match and performing an additional operation.

import re

string = "abcdef123"
string = re.sub(r"(\d+)", "", string)

Does anyone know how I can capture the "\d+" matched value as a separate variable from the same re.sub operation? I tried the following command and it doesn't work.

print(r'\1')
2 Answers

You can cheat and pass a function to re.sub:

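import re

results = []
def capture_and_kill(match):
    results.append(match.group(1))  # remember the captured digits
    return ""                       # replace the match with nothing
string = "abcdef123"
string = re.sub(r"(\d+)", capture_and_kill, string)
# string is now 'abcdef'
# results[0]
# => '123'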
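
If you need this in more than one place, the same trick can be wrapped in a small helper so nothing leaks into a module-level list. This is only a sketch: sub_and_capture is a hypothetical name, not something the re module provides, and it assumes the replacement is a template string rather than a function.

```python
import re

def sub_and_capture(pattern, repl, text):
    """Hypothetical helper: substitute like re.sub and also return the match objects."""
    matches = []

    def record(match):
        matches.append(match)
        # expand() applies backreferences such as \1 in the replacement template
        return match.expand(repl)

    new_text = re.sub(pattern, record, text)
    return new_text, matches

new_string, matches = sub_and_capture(r"(\d+)", "", "abcdef123")
captured = [m.group(1) for m in matches]
print(new_string)
print(captured)
```

Returning the match objects instead of plain strings keeps every group, as well as start() and end(), available to the caller.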
You can do the following:

import re
sub_str = re.search(r"(\d+)", string).group(1)  # string is the input from the question

This finds the "123" part.

Then you replace it:

string = string.replace(sub_str, "")

Note that if the string contains more than one [0-9]+ sequence, you'll need to use findall and iterate over all the matches manually.
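
A rough sketch of the multiple-match case, assuming an example input of "abc123def456" (not from the question). It collects the groups with re.findall and then removes them with one re.sub call instead of str.replace, since replace would also delete the same digits anywhere else they happen to appear:

```python
import re

string = "abc123def456"
pattern = r"(\d+)"

# findall returns the text of group 1 for every match, left to right
groups = re.findall(pattern, string)

# a single re.sub call then removes every digit run
stripped = re.sub(pattern, "", string)

print(groups)
print(stripped)
```

This scans the string twice, which is usually fine for short inputs.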

