How to extract slug from URL with regular expression in Python?

Question

I'm struggling with Python's re. I don't know how to solve the following problem in a clean way.

I want to extract part of a URL.

What I tried so far:

import re

url = 'http://www.example.com/this-2-me-4/123456-subj'
m = re.search('/[0-9]+-', url)
m = m.group(0).rstrip('-')
m = m.lstrip('/')

This leaves me with the desired output 123456, but I feel this is not the proper way to extract the slug.

How can I solve this quicker and cleaner?

Stefan van den Akker · Accepted Answer

Use a capturing group: put parentheses (...) around the part of the regex that you want to capture. You can then get the contents of a capturing group by passing its number as an argument to m.group():

>>> m = re.search('/([0-9]+)-', url)
>>> m.group(1)
'123456'
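
One thing the snippet above does not cover is the case where nothing matches: re.search() then returns None, and calling .group() on it raises AttributeError. A minimal sketch of a guarded version follows; the helper name extract_id and the second test URL are made up for illustration, and the pattern is the one from the answer.

```python
import re

# Same pattern as in the answer; compiled once so it can be reused.
ID_PATTERN = re.compile(r'/([0-9]+)-')


def extract_id(url):
    """Return the digits between '/' and '-' as a string, or None if absent.

    extract_id is a hypothetical helper name, not part of the re module.
    """
    m = ID_PATTERN.search(url)
    # re.search returns None when there is no match, so check before .group()
    return m.group(1) if m else None


url = 'http://www.example.com/this-2-me-4/123456-subj'
print(extract_id(url))                                   # '123456' as a str
print(extract_id('http://www.example.com/no-id-here'))   # no digits after '/'
```

Note that the result is a string; wrap it in int() if a number is needed.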

From the docs:

(...)
Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(], [)].
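
The two other points in that excerpt, the \number backreference and escaping literal parentheses, can be illustrated with a short sketch. The sample strings here are invented for the example and have nothing to do with the URL in the question.

```python
import re

# \1 refers back to whatever group 1 matched; here it finds a doubled word.
m = re.search(r'\b(\w+) \1\b', 'this is is a test')
repeated = m.group(1)

# \( and \) match literal parentheses; the inner (...) is still a capturing group.
m2 = re.search(r'\(([0-9]+)\)', 'ticket (123456) closed')
in_parens = m2.group(1)

print(repeated)
print(in_parens)
```

Raw strings (the r'...' prefix) keep Python from interpreting the backslashes before the regex engine sees them.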

Tags:

python

regex

mcbetz
