
Using Python's regex .match() method to get the string before and after an underscore

I have the following code:

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]

for table in tablesInDataset:
    tableregex = re.compile("\d{8}")
    tablespec = re.match(tableregex, table)

    everythingbeforedigits = tablespec.group(0)
    digits = tablespec.group(1)

My regex should only match a string if it contains 8 digits after an underscore. Once it matches, I want to use .match() to get two groups using the .group() method. The first group should contain all of the characters before the digits, and the second should contain the 8 digits.

What is the correct regex to get the results I am looking for using .match() and .group()?

Erik Åsland asked Dec 15 '22 04:12



1 Answer

Use capture groups:

>>> import re
>>> s = 'henry_jones_12345678'
>>> pat = re.compile(r'(?P<name>.*)_(?P<number>\d{8})')
>>> pat.findall(s)
[('henry_jones', '12345678')]

You get the nice feature of named groups, if you want it:

>>> match = pat.match(s)
>>> match.groupdict()
{'name': 'henry_jones', 'number': '12345678'}
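
Applied to the list from the question, the same idea could look like the sketch below. Note that .match() returns None when a string does not fit the pattern, so the result has to be checked before calling .group(). The trailing $ (so that the 8 digits must end the name), the helper split_table_name, and the snake_case variable names are assumptions added for illustration; they are not part of the original code.

```python
import re

# Same pattern as above, plus "$" (an assumption) so the 8 digits must end the name.
TABLE_PATTERN = re.compile(r'(?P<name>.*)_(?P<number>\d{8})$')


def split_table_name(table):
    """Return (name, number) if table ends in "_" plus 8 digits, else None."""
    match = TABLE_PATTERN.match(table)
    if match is None:
        # No underscore followed by exactly 8 trailing digits.
        return None
    return match.group('name'), match.group('number')


tables_in_dataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]

# Map each table name to its (name, number) pair, or None if it does not match.
results = {table: split_table_name(table) for table in tables_in_dataset}

for table, parts in results.items():
    print(table, '->', parts)
```

Only the first name yields a pair; the other two map to None and can simply be skipped.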
wim answered May 26 '23 18:05
