
Python regex replace whole string

Tags: python, regex

I have a regex to strip the end off a request URL:

re.sub('(?:^\/en\/category).*(-\d{1,4}$)', '', r)

The docs say re.sub replaces the matched part, but when the pattern matches my string, the whole string is replaced, e.g.:

/en/category/specials/men-2610

I'm not sure what Python is doing, but my regex seems fine.

EDIT: I wish to have the string with the end stripped off, target =

/en/category/specials/men
Tjorriemorrie asked Sep 17 '25 09:09


2 Answers

As stated in the docs, the matched part is replaced. Matched is different from captured.

You will have to capture the text you don't want to remove in a capture group like so:

(^/en/category.*)-\d{1,4}$

and put it back into the string using the backreference \1:

re.sub(r'(^/en/category.*)-\d{1,4}$', r'\1', text)
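To make the matched-versus-captured distinction concrete, here is a small sketch that inspects both for the pattern from the question and then compares the two substitutions. The variable names are only illustrative; the URL is the sample from the question.

```python
import re

url = "/en/category/specials/men-2610"

# Pattern from the question: the non-capturing prefix, the ".*" and the
# captured suffix together cover the entire URL.
question_pattern = r'(?:^\/en\/category).*(-\d{1,4}$)'

m = re.search(question_pattern, url)
whole_match = m.group(0)  # the full match, which is what re.sub replaces
captured = m.group(1)     # only the capture group

print(repr(whole_match))  # '/en/category/specials/men-2610'
print(repr(captured))     # '-2610'

# Replacing the full match with '' therefore leaves nothing behind.
stripped_wrong = re.sub(question_pattern, '', url)
print(repr(stripped_wrong))  # ''

# Capturing the part to keep and writing it back with \1 drops only the suffix.
fixed = re.sub(r'(^/en/category.*)-\d{1,4}$', r'\1', url)
print(repr(fixed))  # '/en/category/specials/men'
```

group(0) is always the full match, no matter how many groups the pattern defines, which is why wrapping the suffix in parentheses alone does not change what gets replaced.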
Aran-Fey answered Sep 18 '25 21:09


(?<=^\/en\/category)(.*)-\d{1,4}$

Try this, replacing the match with \1. See the demo:

https://regex101.com/r/tX2bH4/27

Your whole pattern matches the entire string, which is why the whole string is replaced.

P.S. A match is different from captures or groups.

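import re

# The lookbehind keeps "/en/category" out of the match; group 1 holds the part to keep.
p = re.compile(r'(?<=^\/en\/category)(.*)-\d{1,4}$', re.IGNORECASE)
test_str = "/en/category/specials/men-2610"
subst = r"\1"  # raw string, so \1 reaches re.sub as a backreference

result = re.sub(p, subst, test_str)
print(result)  # /en/category/specials/men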
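One detail worth checking when writing the replacement string: in an ordinary (non-raw) Python string literal, "\1" is an octal escape for the control character U+0001, not a backreference. The sketch below compares three spellings of the replacement. It assumes the same lookbehind pattern and sample URL as above, and the variable names are only illustrative.

```python
import re

test_str = "/en/category/specials/men-2610"
pattern = re.compile(r'(?<=^/en/category)(.*)-\d{1,4}$')

# Non-raw "\1": Python turns this into chr(1) before re.sub ever sees it,
# so the match is replaced by a literal control character.
plain = pattern.sub("\1", test_str)

# Raw string: the backslash survives and re.sub reads \1 as group 1.
raw = pattern.sub(r"\1", test_str)

# Doubling the backslash in a non-raw string is equivalent to the raw form.
escaped = pattern.sub("\\1", test_str)

print(repr(plain))    # '/en/category\x01'
print(repr(raw))      # '/en/category/specials/men'
print(repr(escaped))  # '/en/category/specials/men'
```

Writing both the pattern and the replacement as raw strings avoids this class of surprise.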
vks answered Sep 18 '25 21:09