
Python: POSIX character class in regex?

How can I search for, say, a sequence of 10 isprint characters in a given string in Python?

With GNU grep, I would simply do grep -E '[[:print:]]{10}'

nodakai asked Aug 10 '15 08:08



1 Answer

Since POSIX character classes are not supported by Python's re module, you have to emulate them with an explicit character class.

You can use the equivalent of [[:print:]] given on regular-expressions.info and add the limiting quantifier {10}:

[\x20-\x7E]{10}
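
As a minimal sketch of how that pattern could be used with the standard re module (the helper name first_printable_run and the sample strings are made up for illustration; note that this class covers printable ASCII only, space through tilde):

```python
import re

# ASCII-only stand-in for POSIX [[:print:]]: space (0x20) through tilde (0x7E).
PRINTABLE_RUN = re.compile(r"[\x20-\x7E]{10}")


def first_printable_run(text):
    """Return the first run of 10 printable ASCII characters, or None."""
    match = PRINTABLE_RUN.search(text)
    return match.group(0) if match else None


sample = "\x00\x01Hello, world!\x02"
print(first_printable_run(sample))       # Hello, wor
print(first_printable_run("short\x00"))  # None
```

Use re.search (or finditer) rather than re.match here, since the run may start anywhere in the string.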


Alternatively, you can use Matthew Barnett's regex module, whose documentation states that POSIX character classes are supported.
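
A hedged sketch of that alternative, assuming the third-party regex package is installed (pip install regex). The helper name first_printable_run_posix is hypothetical, and which non-ASCII characters [[:print:]] matches depends on that module, so test it against your own data:

```python
# Assumption: the third-party `regex` package may or may not be installed,
# so the import is guarded and the helper degrades to returning None.
try:
    import regex
except ImportError:
    regex = None


def first_printable_run_posix(text):
    """Return the first run of 10 [[:print:]] characters, or None.

    Also returns None when the `regex` package is unavailable.
    """
    if regex is None:
        return None
    match = regex.search(r"[[:print:]]{10}", text)
    return match.group(0) if match else None


print(first_printable_run_posix("\x00\x01abcdefghijkl\x02"))
```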

Wiktor Stribiżew answered Oct 19 '22 01:10