
Extracting groups in a regex match

Tags:

python

regex

I have a set of inputs. I am trying to write a regex to match the following pattern in the input:

Day at Time on location

Example input:

Today at 12:30 PM on Sam's living room

The day, time, and location parts vary in each input.

I wrote the following regex:

import regex as re

input_example = "Today at 12:30 PM on Rakesh's Echo"
regexp_1 = re.compile(r'(\w+) at (\d+):(\d+) (\w+) on (\w+)')
re_match = regexp_1.match(input_example)

This works: the pattern matches the inputs I expect. I am now trying to extract groups from within the match.

My desired output is:

re_match.group(1)
>> "Today"
re_match.group(2)
>> "12:30 PM"
re_match.group(3)
>> "Sam's living room"

However, my current regular expression match does not give me this output. What is the correct regex that will give me the above outputs?

Rakesh Adhikesavan asked Apr 16 '18 15:04


1 Answer

You can nest groups, but numbered nested groups are not very readable: you have to work out the exact number of each group, and later you will forget what each number means.
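To illustrate the numbering problem, here is a minimal sketch of the nested, numbered approach. Groups are numbered by the position of their opening parenthesis, so the inner hour, minute, and AM/PM groups take numbers 3 to 5 and push the location to group 6. The variable names and the use of `(.+)` for the location are assumptions for illustration (the latter assumes the location runs to the end of the string).

```python
import re

# Groups are numbered left to right by their opening parenthesis.
pattern = re.compile(r"(\w+) at ((\d+):(\d+) (\w+)) on (.+)")
match = pattern.match("Today at 12:30 PM on Sam's living room")

day = match.group(1)       # first outer group
time = match.group(2)      # outer group wrapping hour, minute and AM/PM
location = match.group(6)  # groups 3-5 are the ones nested inside group 2

print(day, time, location, sep=" | ")
```

Adding or removing any inner group shifts every number after it, which is why this style is fragile.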

It's better to use named groups. This is copied from the REPL:

>>> import re
>>> input_example = "Today at 12:30 PM on Rakesh's Echo"
>>> regexp_1 = re.compile(r'(?P<day>\w+) at (?P<time>(\d+):(\d+) (\w+)) on (?P<place>\w+)')
>>> re_match = regexp_1.match(input_example)
>>> list(re_match.groups())
['Today', '12:30 PM', '12', '30', 'PM', 'Rakesh']
>>> re_match.group('day')
'Today'
>>> re_match.group('time')
'12:30 PM'
>>> re_match.group('place')
'Rakesh'
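
Note that in the session above `place` is 'Rakesh' rather than "Rakesh's Echo", because `\w` matches neither the apostrophe nor the space. The following is a possible variation, not a definitive fix: it assumes the location always runs to the end of the input and that the time is always followed by AM or PM. It also makes the AM/PM group non-capturing with `(?:...)`, so only the three named groups are captured.

```python
import re

# Assumptions: the location is everything after " on ", and the time is
# written as H:MM or HH:MM followed by AM or PM.
pattern = re.compile(
    r"(?P<day>\w+) at (?P<time>\d{1,2}:\d{2} (?:AM|PM)) on (?P<place>.+)"
)

match = pattern.match("Today at 12:30 PM on Rakesh's Echo")
result = match.groupdict()  # maps each group name to the text it captured

print(result["day"])
print(result["time"])
print(result["place"])
```

Since `match()` returns `None` when the input does not fit the pattern, real code should check the result before calling `groupdict()`.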

Mariy answered Sep 30 '22 17:09