
Is [:graph:] equivalent to \S in regular expressions?

Tags: regex, posix

There is a table at http://www.regular-expressions.info/posixbrackets.html that summarizes all the POSIX bracket expressions and also provides the equivalent shorthand.

I don't understand why the table doesn't list \S as a shorthand for [:graph:]. Are they different? If so, could you please explain to me, with examples, how they differ?

asked Sep 21 '14 14:09 by Lone Learner




1 Answer

[:graph:] is a different character class from \S.

[:graph:] matches only visible characters, whereas \S matches any character that is not whitespace (space, newline, carriage return, form feed, tab, vertical tab, ...).

For example, [:graph:] does not match control characters such as NUL, backspace, or BEL, but \S matches them.


Python example using the regex package (which supports POSIX character classes):

>>> import regex
>>> regex.findall(r'[[:graph:]]', 'a \0 \a \b z')
['a', 'z']
>>> regex.findall(r'\S', 'a \0 \a \b z')
['a', '\x00', '\x07', '\x08', 'z']
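
If the third-party regex package is not available, the same difference can be sketched with the standard re module. This assumes ASCII input and uses the range [\x21-\x7E] as a stand-in for [[:graph:]] (the two agree only for ASCII). The names ASCII_GRAPH, NON_SPACE and only_non_space are made up for this illustration.

```python
import re

# Assumption: for ASCII text, POSIX [[:graph:]] covers exactly the visible
# characters 0x21 ('!') through 0x7E ('~'). This range is only a stand-in.
ASCII_GRAPH = re.compile(r'[\x21-\x7E]')
NON_SPACE = re.compile(r'\S')


def only_non_space(text):
    # Return the characters that \S matches but the ASCII graph range does not.
    return [ch for ch in NON_SPACE.findall(text) if not ASCII_GRAPH.fullmatch(ch)]


sample = 'a \0 \a \b \x7f z'
print(only_non_space(sample))  # ['\x00', '\x07', '\x08', '\x7f']
```

The characters left over are all control characters (NUL, BEL, backspace, DEL): they are not whitespace, so \S matches them, but they are not visible, so the graph range does not.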
answered Nov 10 '22 12:11 by falsetru