Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to match alphabetical chars without numeric chars with Python regexp?

Tags:

python

regex

Using Python module re, how to get the equivalent of the "\w" (which matches alphanumeric chars) WITHOUT matching the numeric characters (those which can be matched by "[0-9]")?

Notice that the basic need is to match any character (including all unicode variation) without numerical chars (which are matched by "[0-9]").

As a final note, I really need a regexp as it is part of a greater regexp.

Underscores should not be matched.

EDIT:

  • I hadn't thought about underscores state, so thanks for warnings about this being matched by "\w" and for the elected solution that addresses this issue.
like image 510
vaab Avatar asked Nov 04 '09 13:11

vaab


2 Answers

You want [^\W\d]: the group of characters that is not (either a digit or not an alphanumeric). Add an underscore in that negated set if you don't want them either.

A bit twisted, if you ask me, but it works. Should be faster than the lookahead alternative.

like image 114
Alice Purcell Avatar answered Sep 16 '22 22:09

Alice Purcell


(?!\d)\w

A position that is not followed by a digit, and then \w. Effectively cancels out digits but allows the \w range by using a negative look-ahead.

The same could be expressed as a positive look-ahead and \D:

(?=\D)\w

To match multiple of these, enclose in parens:

(?:(?!\d)\w)+
like image 40
Tomalak Avatar answered Sep 18 '22 22:09

Tomalak