python return matching and non-matching patterns of string

Tags:

python

string

regex

I would like to split a string into a list containing both the parts that match a regexp pattern and the parts that do not.

For example:

import re
string = 'my_file_10'
pattern = r'\d+$'
# I know the matching part can be obtained with:
m = re.search(pattern, string).group()
print(m)
# 10
# The final result should be as follows:
# ['my_file_', '10']

asked Jun 27 '14 17:06

user1850133

1 Answer

Put parentheses around the pattern to make it a capturing group, then use re.split() to produce a list of matching and non-matching elements:

pattern = r'(\d+$)'
re.split(pattern, string)

Demo:

>>> import re
>>> string = 'my_file_10'
>>> pattern = r'(\d+$)'
>>> re.split(pattern, string)
['my_file_', '10', '']

Because the match sits at the very end of the string, re.split() also returns the empty string that follows it as the last element.
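
If the trailing empty string is unwanted, one option is to filter it out of the result. A minimal sketch, assuming the pattern has no capturing groups of its own (the helper name split_keep_matches is made up for illustration and is not part of the re module):

```python
import re

def split_keep_matches(pattern, string):
    # Hypothetical helper: wrap the pattern in a capturing group so that
    # re.split() keeps the matched text, then drop any empty strings.
    # Assumes `pattern` contains no capturing groups of its own.
    parts = re.split('({})'.format(pattern), string)
    return [part for part in parts if part]

print(split_keep_matches(r'\d+$', 'my_file_10'))
# ['my_file_', '10']
```

The same helper should also work when the pattern matches in the middle of the string or more than once, since it only removes empty elements.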

If you only ever expect one match, at the end of the string (which the $ in your pattern forces here), then just use the match object's start() method to obtain an index for slicing the input string:

pattern = r'\d+$'
match = re.search(pattern, string)
not_matched, matched = string[:match.start()], match.group()

This returns:

>>> pattern = r'\d+$'
>>> match = re.search(pattern, string)
>>> string[:match.start()], match.group()
('my_file_', '10')
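
Note that re.search() returns None when nothing matches, so calling .start() on the result would raise an AttributeError for a string without trailing digits. A minimal sketch that guards against this, assuming the match always reaches the end of the string (the function name split_trailing_match is hypothetical, and returning the whole string as a one-element list on no match is just one possible choice):

```python
import re

def split_trailing_match(pattern, string):
    # Hypothetical helper: return [non-matching prefix, matched text],
    # or the whole string as a one-element list when nothing matches.
    match = re.search(pattern, string)
    if match is None:
        return [string]
    return [string[:match.start()], match.group()]

print(split_trailing_match(r'\d+$', 'my_file_10'))
# ['my_file_', '10']
print(split_trailing_match(r'\d+$', 'my_file'))
# ['my_file']
```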

answered Sep 24 '22 00:09

Martijn Pieters
