
Postgres regex: Behavior of \s and \S and character class seems wrong

The documentation says that \s is whitespace and \S is not whitespace. So far, nothing new to regex users.

But let's check some return values:

SELECT SUBSTRING('abc a c' FROM 'a\\sc');
'a c'

SELECT SUBSTRING('abc a c' FROM 'a[\\s]c'); -- Note the character class
'a c'

SELECT SUBSTRING('abc a c' FROM 'a\\Sc');
'abc'

SELECT SUBSTRING('abc a c' FROM 'a[\\S]c'); -- Note the character class
ERROR:  invalid regular expression: invalid escape \ sequence

So it seems that \s can be used inside a character class but \S cannot. Why?

Leif asked Sep 23 '11 07:09



1 Answer

From the manual:

Within bracket expressions, \d, \s, and \w lose their outer brackets, and \D, \S, and \W are illegal.

In any case, the brackets seem redundant since \s and \S themselves are character classes.

The following syntax works for me as an alternative to a[\\S]c:

SELECT SUBSTRING('abc a c' FROM 'a[^[:space:]]c');
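
Since the manual says \s simply loses its outer brackets inside a bracket expression, negating the bracket expression itself should be another way to get "not whitespace", and it also lets you exclude further characters in the same class. The queries below are a minimal sketch, not tested against any particular server version. They assume standard_conforming_strings = on (the default since PostgreSQL 9.1), so backslashes are written once. The doubled backslashes in the question suggest the older setting, in which case double them here too. The input 'a,c abc' in the last query is made up for illustration.

```sql
-- Assumption: standard_conforming_strings = on, so '\s' reaches the
-- regex engine as a single backslash followed by s.

-- Negated bracket expression around \s: matches any non-whitespace
-- character between a and c. Expected to return 'abc'.
SELECT SUBSTRING('abc a c' FROM 'a[^\s]c');

-- Equivalent POSIX class spelling, as in the query above.
SELECT SUBSTRING('abc a c' FROM 'a[^[:space:]]c');

-- The negated form can exclude more than whitespace, which a bare \S
-- cannot do: here neither whitespace nor a comma may sit between a and c,
-- so 'a,c' is skipped and 'abc' should be returned.
SELECT SUBSTRING('a,c abc' FROM 'a[^\s,]c');
```

When whitespace is the only thing to exclude, plain a\Sc without brackets remains the shortest spelling.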

NPE answered Oct 11 '22 17:10