Regex match for a non-English language in Python

I'm trying to capture and match Russian characters in a Python script. Since Russian characters don't fall within a range like [a-zA-Z], what regex should I use to match them? I can't use (.*) because it would match everything.

linkpat = re.compile(r'name=[a-zA-Z]+;size=[0-9]+')
Neo asked Dec 15 '25 09:12


2 Answers

Use the Unicode flag:

re.compile(r'name=\w+;size=\d+', re.U)

Note that \w then matches any letter in any language, plus digits and the underscore, not just Russian. In Python 3, str patterns already behave this way by default.
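
If only Russian letters should match, an explicit character class is one option. The following is a minimal sketch, assuming the names contain only the basic Cyrillic letters а-я and А-Я plus ё and Ё (that assumption is not stated in the question, and the sample strings are made up for illustration):

```python
import re

# Assumption: "Russian characters" means а-я / А-Я plus ё / Ё.
# ё and Ё are listed separately because they sit outside the
# contiguous а-я / А-Я ranges in Unicode.
linkpat = re.compile(r'name=[а-яА-ЯёЁ]+;size=[0-9]+')

# Hypothetical sample inputs.
print(bool(linkpat.search('name=привет;size=42')))  # True
print(bool(linkpat.search('name=hello;size=42')))   # False

# For comparison, the \w version also accepts Latin letters.
unicode_pat = re.compile(r'name=\w+;size=\d+', re.U)
print(bool(unicode_pat.search('name=hello;size=42')))  # True
```

The explicit class rejects Latin names, while \w accepts them, so the choice depends on whether other alphabets should be allowed.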

SilentGhost answered Dec 17 '25 00:12


You can try \w with the re.LOCALE flag and the correct locale set. Note that in Python 3, re.LOCALE works only with bytes patterns.

eumiro answered Dec 17 '25 00:12


