I'm struggling with Python's re module. I don't know how to solve the following problem in a clean way.
I want to extract a part of a URL.
What I tried so far:
import re
url = 'http://www.example.com/this-2-me-4/123456-subj'
m = re.search('/[0-9]+-', url)
m = m.group(0).rstrip('-')
m = m.lstrip('/')
This leaves me with the desired output 123456, but I feel this is not the proper way to extract the slug.
How can I solve this quicker and cleaner?
Use a capturing group: put parentheses (...) around the part of the regex that you want to capture. You can then get the contents of the group by passing its number as an argument to m.group():
>>> m = re.search('/([0-9]+)-', url)
>>> m.group(1)
'123456'
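Note that re.search() returns None when nothing matches, so calling m.group(1) directly raises AttributeError on a URL without a numeric part. A minimal sketch that guards against this is below; the helper name extract_id and the second URL are made up for illustration.

```python
import re

def extract_id(url):
    """Return the digits between a '/' and the following '-', or None."""
    m = re.search(r'/([0-9]+)-', url)
    # re.search() returns None when there is no match, so check before
    # asking for the captured group.
    return m.group(1) if m else None

url = 'http://www.example.com/this-2-me-4/123456-subj'
print(extract_id(url))                             # 123456
print(extract_id('http://www.example.com/about'))  # None
```

The raw string prefix (r'...') is not strictly needed for this pattern, but it is a common habit that avoids surprises once the pattern contains backslashes.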
From the docs:
(...)
Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(], [)].
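To make the quoted passage concrete, here is a small sketch of the \number backreference and of the two ways to match literal parentheses. The sample strings are invented for illustration and have nothing to do with the URL in the question.

```python
import re

# \1 matches the same text that group 1 captured: here, a doubled word.
m = re.search(r'\b(\w+) \1\b', 'this is is a test')
repeated = m.group(1)
print(repeated)        # is

# \( and \) match literal parentheses; the inner (...) still captures.
m2 = re.search(r'\(([0-9]+)\)', 'order (42) shipped')
in_parens = m2.group(1)
print(in_parens)       # 42

# A character class does the same job: [(] and [)] are literal parentheses.
m3 = re.search(r'[(]([0-9]+)[)]', 'order (42) shipped')
print(m3.group(1))     # 42
```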