
How to validate unicode "word characters" in Python regex?

I have a form with this field:

name = forms.RegexField(regex=r'\w+$', label=u'Name', required=True)

But if I type special characters (ś, for example), the form does not pass is_valid(). How can I make \w accept them?

Nips asked Oct 01 '12 06:10



2 Answers

Activate Unicode matching for \w by putting the inline (?u) flag at the start of the pattern:

name = forms.RegexField(regex=r'(?u)\w+$', label=u'Name', required=True)
Ignacio Vazquez-Abrams answered Sep 21 '22 18:09


Instead of defining the regex as a string, you can compile it to a regex object first, setting the re.U flag:

import re

name_regex = re.compile(r'\w+$', re.U)
name = forms.RegexField(regex=name_regex, label=u'Name', required=True)
Benjamin Wohlwend answered Sep 21 '22 18:09
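
To see what the flag changes without involving Django, here is a minimal sketch using only the re module. It assumes Python 3, where str patterns are already Unicode-aware by default (so re.U and (?u) are redundant but harmless) and re.ASCII is available for contrast; in Python 2, which the question appears to target, the Unicode flag has to be set explicitly. The variable names are illustrative and not taken from either answer.

```python
import re

# Unicode-aware \w via the compile-time flag (as in the second answer).
unicode_word = re.compile(r'\w+$', re.U)

# The same behaviour via the inline flag (as in the first answer).
inline_unicode_word = re.compile(r'(?u)\w+$')

# ASCII-only \w, shown for contrast; re.ASCII exists in Python 3 only.
ascii_word = re.compile(r'\w+$', re.ASCII)

name = 'Ja\u015b'  # "Jaś", ending in the character from the question

# match() anchors at the start; the trailing $ anchors at the end.
print(bool(unicode_word.match(name)))
print(bool(inline_unicode_word.match(name)))
print(bool(ascii_word.match(name)))
```

With ASCII-only matching, \w stops before ś, so the trailing $ can never be reached and the name is rejected; with Unicode matching the whole name is consumed.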