I need to make a URL pattern which could work with this URL:
mysite.com/blog/12/بلاگ-مثال
It contains utf-8 characters so I tried using \X:
re_path(r'^blog/?P<blog_id>[\d+]+/(?P<slug>[\X.*]+)/$', views.single_blog, name='single_blog')
But it didn't work, and I don't know why. Maybe it's just because I'm not good at regex. So I tried a different pattern using just .* to accept anything:
re_path(r'^blog/?P<blog_id>[\d+]+/(?P<slug>[.*]+)/$', views.single_blog, name='single_blog')
But this also doesn't work and I get:
The current path, blog/12/بلاگ-مثال, didn't match any of these.
So, as I mentioned, I'm not good at regex. What's the right way to fix this?
Is this the right time to say "now I have two problems", or is regex the only way?
Your approach did not work because \X is not supported by Python's re module, and [.*]+ matches one or more dots or asterisks rather than arbitrary characters (inside a [...] character class, . and * denote literal symbols, not special characters).
Besides, [\d+]+ is also a character class: it matches a digit or a +, one or more times, so that part is a problem too.
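As a quick check of those two character classes, here is a minimal sketch using Python's re module directly. The sample strings are made up for illustration and are not from the question.

```python
import re

# Inside [...], "." and "*" are literal characters, not "any char" / "repeat".
dots_or_stars = re.compile(r'[.*]+')
# Inside [...], "+" is a literal plus sign, so this accepts digits AND "+".
digits_or_plus = re.compile(r'[\d+]+')

matches_symbols = dots_or_stars.fullmatch('..**.') is not None
matches_letters = dots_or_stars.fullmatch('abc') is not None
matches_plus = digits_or_plus.fullmatch('12+3') is not None

print(matches_symbols)  # the class accepts dots and asterisks
print(matches_letters)  # but not ordinary letters
print(matches_plus)     # and a "+" slips through next to the digits
```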
You may use a [^/] negated character class to match any char but /:
r'^blog/(?P<blog_id>\d+)/(?P<slug>[^/]+)/?$'
Details
^ - start of input
blog/ - a literal substring
(?P<blog_id>\d+) - Group "blog_id": 1+ digits
/ - a /
(?P<slug>[^/]+) - Group "slug": 1+ chars other than /
/? - an optional /
$ - end of string.
Here is a regex demo (note that highlighting of characters from the Arabic script does not work there).
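To try the pattern outside Django, a small sketch with plain re could look like this. It assumes the path is tested without its leading slash, the way the error message in the question prints it. The variable names are only for illustration.

```python
import re

# The pattern from this answer, compiled on its own for testing.
pattern = re.compile(r'^blog/(?P<blog_id>\d+)/(?P<slug>[^/]+)/?$')

# The path from the question, with and without a trailing slash.
m = pattern.match('blog/12/بلاگ-مثال')
m_slash = pattern.match('blog/12/بلاگ-مثال/')
# An extra path segment must not match, because [^/]+ stops at a slash.
m_extra = pattern.match('blog/12/بلاگ-مثال/extra')

print(m.group('blog_id'), m.group('slug'))
print(m_slash.group('slug'))
print(m_extra)
```

In Django, the named groups are passed to the view as keyword arguments, so views.single_blog would receive blog_id and slug as strings.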
Is it the right time to say now I have two problems ...
In fact, you have chosen the right tool for this job.
The other answer seems valid, but it can't tolerate having the word Persian in it. I'm posting this answer to point out why your own regex doesn't work as expected.
?P<blog_id>[\d+]+
Probably you meant a named group here, the same as the one you used later in the regex. You missed the opening and closing parentheses: (?P<blog_id>[\d+]+). Also, [\d+] is a character class consisting of digits and +. You need to remove the +: (?P<blog_id>[0-9]+)
(?P<slug>[\X.*]+)
The construction is fine, but the character class is not. \X doesn't have a special meaning in a character class, and Python's re module doesn't support it at all. .* is no exception: in a character class, almost all special tokens are treated literally.
So [\X.*] matches an X, a . or an asterisk *. You need to change it to something more general like [^/]+, which means match up to the first slash (= match anything except a forward slash).
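To see what the missing parentheses do in practice, here is a minimal sketch with plain re. The test strings are hypothetical, and only the blog_id part of the pattern is shown.

```python
import re

# Without parentheses, "/?" is an optional slash and "P<blog_id>" is literal text.
broken = re.compile(r'^blog/?P<blog_id>[\d+]+')
# With parentheses, (?P<blog_id>...) is a named group capturing the digits.
fixed = re.compile(r'^blog/(?P<blog_id>[0-9]+)')

no_match = broken.match('blog/12')              # a real URL path is rejected
odd_match = broken.match('blog/P<blog_id>12')   # only the literal text is accepted
good = fixed.match('blog/12')

print(no_match)
print(odd_match is not None)
print(good.group('blog_id'))
```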