I would like to split a string into parts that match a regexp pattern and parts that do not match into a list.
For example
import re
string = 'my_file_10'
pattern = r'\d+$'
# I know the matching pattern can be obtained with :
m = re.search(pattern, string).group()
print m
'10'
# The final result should be as following
['my_file_', '10']
Method : Using join regex + loop + re.match() In this, we create a new regex string by joining all the regex list and then match the string against it to check for match using match() with any of the element of regex list.
Use re.search() to extract a substring matching a regular expression pattern. Specify the regular expression pattern as the first parameter and the target string as the second parameter. \d matches a digit character, and + matches one or more repetitions of the preceding pattern.
match() function of re in Python will search the regular expression pattern and return the first occurrence. The Python RegEx Match method checks for a match only at the beginning of the string. So, if a match is found in the first line, it returns the match object.
Exact match (equality comparison): == , != As with numbers, the == operator determines if two strings are equal. If they are equal, True is returned; if they are not, False is returned. It is case-sensitive, and the same applies to comparisons by other operators and methods.
Put parenthesis around the pattern to make it a capturing group, then use re.split()
to produce a list of matching and non-matching elements:
pattern = r'(\d+$)'
re.split(pattern, string)
Demo:
>>> import re
>>> string = 'my_file_10'
>>> pattern = r'(\d+$)'
>>> re.split(pattern, string)
['my_file_', '10', '']
Because you are splitting on digits at the end of the string, an empty string is included.
If you only ever expect one match, at the end of the string (which the $
in your pattern forces here), then just use the m.start()
method to obtain an index to slice the input string:
pattern = r'\d+$'
match = re.search(pattern, string)
not_matched, matched = string[:match.start()], match.group()
This returns:
>>> pattern = r'\d+$'
>>> match = re.search(pattern, string)
>>> string[:match.start()], match.group()
('my_file_', '10')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With