
PostgreSQL Regex Word Boundaries?

Does PostgreSQL support \b?

I'm trying \bAB\b, but it doesn't match anything, whereas (\W|^)AB(\W|$) does. These two expressions are essentially the same, aren't they?

mpen asked Sep 29 '10 20:09




1 Answer

PostgreSQL uses \m, \M, \y and \Y as word boundaries:

\m   matches only at the beginning of a word
\M   matches only at the end of a word
\y   matches only at the beginning or end of a word
\Y   matches only at a point that is not the beginning or end of a word

See Regular Expression Constraint Escapes in the manual.
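
As a quick illustration of these escapes, here is a minimal sketch using literal strings instead of a real table. The column aliases are made up for the example. It assumes standard_conforming_strings is on (the default since PostgreSQL 9.1); on older servers you would likely need the escape-string form, e.g. E'\\yAB\\y'. Note also that in PostgreSQL's advanced regular expressions \b is the escape for a backspace character, which is why \bAB\b matches nothing in ordinary text.

```sql
-- Assumes standard_conforming_strings = on, so '\y' reaches the regex engine unchanged.
SELECT 'x AB y' ~ '\yAB\y' AS standalone,   -- true: AB is a whole word
       'xABy'   ~ '\yAB\y' AS embedded,     -- false: AB sits inside a longer word
       'foobar' ~ '\mbar'  AS starts_word,  -- false: "bar" does not begin a word here
       'foobar' ~ 'bar\M'  AS ends_word,    -- true: "bar" ends the word
       'foobar' ~ '\Ybar'  AS not_boundary; -- true: no boundary between "o" and "b"

-- Unlike (\W|^)AB(\W|$), the \y form consumes no surrounding characters.
SELECT regexp_replace('x AB y', '(\W|^)AB(\W|$)', '<>') AS consuming,  -- x<>y
       regexp_replace('x AB y', '\yAB\y', '<>')         AS zero_width; -- x <> y
```

The second query shows where the two patterns from the question differ: (\W|^) and (\W|$) match and consume the neighbouring non-word characters, while \y only asserts a position.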

There is also [[:<:]] and [[:>:]], which match the beginning and end of a word. From the manual:

There are two special cases of bracket expressions: the bracket expressions [[:<:]] and [[:>:]] are constraints, matching empty strings at the beginning and end of a word respectively. A word is defined as a sequence of word characters that is neither preceded nor followed by word characters. A word character is an alnum character (as defined by ctype) or an underscore. This is an extension, compatible with but not specified by POSIX 1003.2, and should be used with caution in software intended to be portable to other systems. The constraint escapes described below are usually preferable (they are no more standard, but are certainly easier to type).
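
For comparison, the same whole-word check can be sketched with the bracket-expression constraints. The strings and aliases are again only illustrative. Since this form contains no backslashes, the string-escaping settings should not affect it.

```sql
SELECT 'x AB y' ~ '[[:<:]]AB[[:>:]]' AS standalone,  -- true: AB is a whole word
       'xABy'   ~ '[[:<:]]AB[[:>:]]' AS embedded;    -- false: AB sits inside a longer word
```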

Daniel Vandersluis answered Sep 22 '22 22:09