Python extract pattern matches

People also ask

How do I extract a specific pattern from a string in Python?

Use re.search() to extract the first substring that matches a regular expression. Pass the pattern as the first argument and the target string as the second. For example, in the pattern \d+, \d matches a digit character and + matches one or more repetitions of the preceding element, so \d+ matches a run of digits.
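A minimal sketch of that call, assuming a made-up sample string (the variable names are only for illustration):

```python
import re

# Hypothetical sample text; any string containing digits would do.
text = "Order 66 shipped on day 7"

# \d+ matches one or more consecutive digits; search() returns the first match.
match = re.search(r"\d+", text)

# search() returns None when nothing matches, so check before calling group().
first_number = match.group() if match else None
print(first_number)  # 66
```

Note that the result is a string; wrap it in int() if you need a number.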

What is the difference between re.search() and re.match()?

Both return a match object for the first match, or None if there is none. The difference is where they look: re.match() only tries to match at the beginning of the string, so if the pattern occurs only somewhere in the middle, it returns None. re.search() scans the whole string and returns the first match at any position.
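To see the difference side by side, here is a small sketch with an illustrative input where the digits are not at the start of the string:

```python
import re

text = "id: 42"  # illustrative input

# match() anchors at position 0, so the digits in the middle are not found.
at_start = re.match(r"\d+", text)

# search() scans forward and finds the first occurrence anywhere.
anywhere = re.search(r"\d+", text)

print(at_start)          # None
print(anywhere.group())  # 42
```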

What does re.match() return in Python?

re.match() looks for the regex pattern only at the beginning of the target string. It returns a match object if the pattern matches there; otherwise it returns None.
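Because the result may be None, it is usually worth guarding before calling group(). A rough sketch, where leading_word is a hypothetical helper name and not part of the re module:

```python
import re

def leading_word(text):
    """Return the word at the very start of text, or None if there is none."""
    m = re.match(r"\w+", text)  # only tries to match at index 0
    return m.group() if m is not None else None

print(leading_word("hello world"))    # hello
print(leading_word("  hello world"))  # None, because the string starts with spaces
```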

You need to capture from the regex: search for the pattern and, if it is found, retrieve the string using group(index). Assuming the validity checks are performed:

>>> import re
>>> s = "name my_user_name is valid"   # example input, assumed for illustration
>>> p = re.compile("name (.*) is valid")
>>> result = p.search(s)
>>> result
<re.Match object; span=(0, 26), match='name my_user_name is valid'>
>>> result.group(1)     # group(1) returns the 1st capture (the text matched inside the parentheses).
                        # group(0) returns the entire matched text.
'my_user_name'

You can use matching groups:

p = re.compile('name (.*) is valid')

e.g.

>>> import re
>>> p = re.compile('name (.*) is valid')
>>> s = """
... someline abc
... someother line
... name my_user_name is valid
... some more lines"""
>>> p.findall(s)
['my_user_name']

Here I use re.findall rather than re.search to get all instances of my_user_name. Using re.search, you'd need to get the data from the group on the match object:

>>> p.search(s)   #gives a match object or None if no match is found
<_sre.SRE_Match object at 0xf5c60>
>>> p.search(s).group() #entire string that matched
'name my_user_name is valid'
>>> p.search(s).group(1) #first group that match in the string that matched
'my_user_name'
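If the text can contain several matching lines, findall() and finditer() are probably the more natural fit. A small sketch, assuming a made-up two-line input (alice and bob are placeholders):

```python
import re

# Illustrative input with two matching lines.
log = """name alice is valid
name bob is valid"""

pattern = re.compile(r"name (.*) is valid")

# With one capture group, findall() returns a list of that group's text.
names = pattern.findall(log)

# finditer() yields match objects, which also expose positions.
spans = [(m.group(1), m.start(1)) for m in pattern.finditer(log)]

print(names)  # ['alice', 'bob']
print(spans)  # [('alice', 5), ('bob', 25)]
```

Since . does not match a newline by default, each match stays within its own line here.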

As mentioned in the comments, you might want to make your regex non-greedy:

p = re.compile('name (.*?) is valid')

to only pick up the stuff between 'name ' and the next ' is valid' (rather than allowing your regex to pick up other ' is valid' occurrences in your group).
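To illustrate the difference, here is a quick sketch with an assumed one-line input that contains ' is valid' twice:

```python
import re

# Illustrative single line containing ' is valid' twice.
s = "name my_user_name is valid and the token is valid"

# Greedy: .* grabs as much as possible, up to the LAST ' is valid'.
greedy = re.search(r"name (.*) is valid", s).group(1)

# Non-greedy: .*? stops at the FIRST ' is valid' that lets the match succeed.
lazy = re.search(r"name (.*?) is valid", s).group(1)

print(greedy)  # my_user_name is valid and the token
print(lazy)    # my_user_name
```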

You could use something like this:

import re
s = "name my_user_name is valid"  # stand-in for that big string
# the parentheses create a group with what was matched,
# and '\w' matches only letters, digits and underscores
# (optional flags such as re.IGNORECASE can be passed as a second argument)
p = re.compile(r"name +(\w+) +is valid")
# use search(), so the match doesn't have to happen
# at the beginning of "big string"
m = p.search(s)
# search() returns a Match object with information about what was matched,
# or None if nothing matched
if m:
    name = m.group(1)
else:
    raise ValueError('name not found')

Maybe that's a bit shorter and easier to understand:

>>> import re
>>> text = '... someline abc... someother line... name my_user_name is valid.. some more lines'
>>> re.search('name (.*) is valid', text).group(1)
'my_user_name'

You can use groups (indicated with '(' and ')') to capture parts of the string. The match object's group() method then gives you the group's contents:

>>> import re
>>> s = 'name my_user_name is valid'
>>> match = re.search('name (.*) is valid', s)
>>> match.group(0)  # the entire match
'name my_user_name is valid'
>>> match.group(1)  # the first parenthesized subgroup
'my_user_name'

In Python 3.6+ you can also index into a match object instead of using group():

>>> match[0]  # the entire match 
'name my_user_name is valid'
>>> match[1]  # the first parenthesized subgroup
'my_user_name'
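As a related option, groups can also be given names with (?P<name>...), which tends to read better than numeric indexes once a pattern has more than one group. A short sketch, where the group name user is just an illustrative choice:

```python
import re

s = "name my_user_name is valid"

# (?P<user>...) is a capture group that can be looked up by name.
m = re.search(r"name (?P<user>\w+) is valid", s)

print(m.group("user"))  # my_user_name
print(m["user"])        # my_user_name (index syntax, Python 3.6+)
print(m.groupdict())    # {'user': 'my_user_name'}
```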
