
Python regex: Including whitespace inside character range

I have a regular expression that matches letters, digits, _ and - (with a minimum and maximum length).

^[a-zA-Z0-9_-]{3,100}$

I want to include whitespace in that set of characters.

According to the Python documentation:

Character classes such as \w or \S are also accepted inside a set.

So I tried:

^[a-zA-Z0-9_-\s]{3,100}$

But it gives a "bad character range" error. How can I include whitespace in the above set?

Nitish Parkar asked Oct 30 '12 15:10



2 Answers

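The problem is not the \s but the -, which denotes a character range unless it is at the start or end of the class. In _-\s the hyphen is read as a range from _ to \s, and a class shorthand such as \s cannot be a range endpoint. Move the hyphen to the end: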

^[a-zA-Z0-9_\s-]{3,100}$
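A minimal sketch of the difference, using only the standard re module. The names BROKEN, FIXED and compiles are made up for this illustration, and the sample strings are arbitrary:

```python
import re

# Pattern from the question: the hyphen sits between _ and \s,
# so the parser tries to build a range from _ to \s.
BROKEN = r"^[a-zA-Z0-9_-\s]{3,100}$"
# Pattern from this answer: the hyphen is last, so it is a literal.
FIXED = r"^[a-zA-Z0-9_\s-]{3,100}$"


def compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


fixed_re = re.compile(FIXED)

print(compiles(BROKEN))                           # False
print(compiles(FIXED))                            # True
print(bool(fixed_re.match("hello world-1")))      # True: space and hyphen allowed
print(bool(fixed_re.match("ab")))                 # False: shorter than 3 characters
```

Note that \s also covers tabs and newlines; if only a plain space should be allowed, a literal space in the class would be the narrower choice.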
Martin Ender answered Oct 24 '22 21:10



^[-a-zA-Z0-9_\s]{3,100}$

_-\s was interpreted as a range. A dash representing itself has to be the first or last character inside [...], or be escaped as \-.
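As a rough check, the placements of a literal dash (first, last, or escaped with a backslash) can be compared side by side. PATTERNS, SAMPLES and results are hypothetical names for this sketch, and each pattern is anchored with $ as in the question:

```python
import re

# Three spellings of the same character class; only the dash placement differs.
PATTERNS = {
    "leading": r"^[-a-zA-Z0-9_\s]{3,100}$",
    "trailing": r"^[a-zA-Z0-9_\s-]{3,100}$",
    "escaped": r"^[a-zA-Z0-9_\-\s]{3,100}$",
}

# Arbitrary sample inputs: two valid, one too short, one with a forbidden character.
SAMPLES = ["my file-name_1", "tab\there", "ab", "no$symbols"]


def results(pattern):
    """Return one boolean per sample: does the whole string match?"""
    compiled = re.compile(pattern)
    return [bool(compiled.match(s)) for s in SAMPLES]


outcomes = {name: results(p) for name, p in PATTERNS.items()}
for name, outcome in outcomes.items():
    print(name, outcome)   # each line ends with [True, True, False, False]
```

All three variants accept and reject the same strings, so the choice between them is a matter of readability.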

dda answered Oct 24 '22 21:10
