Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle REGEXP_LIKE doesn't work as expected

Tags:

regex

sql

oracle

I was testing a regular expression in Oracle SQL and found something I could not understand:

-- NO MATCH
SELECT 1 FROM DUAL WHERE REGEXP_LIKE ('Professor Frank', '(^|\s)Prof[^\s]*(\s|$)');

Above doesn't match, while the following matches:

-- MATCH
SELECT 1 FROM DUAL WHERE REGEXP_LIKE ('Professor Frank', '(^|\s)Prof\S*(\s|$)'); 

In other regex flavors, It will be like \bProf[^\s]*\b versus \bProf\S*\b and have similar results. Note: Oracle SQL regex does not have \b or word boundary.

Question: Why don't [^\s]* and \S* work the same way in Oracle SQL?

I notice if I remove the (\s|$) at the end, the first regex will match.

like image 505
qdo Avatar asked Oct 06 '16 20:10

qdo


1 Answers

In Oracle regular expressions, \s is indeed the escape sequence for a space, but NOT in a matching character set (that is, [.....], or [^....] for excluding one character). In a matching character set, only two characters have a special meaning, - for ranges and ] for closing the set enumeration. They can't be escaped; if needed in the matching set, ] must always be the first character right after the opening [ (it is the ONLY position in which a closing ] stands for itself as a character, and does not denote the end of the matching set), and - must be first or last (best to leave it always to the end of the matching set) - anywhere else it is seen as a range marker. To include (or exclude, if using the [^.....] syntax) a space, just type an actual physical space in the matching set.

Edit: What I said above is not entirely right. There is another special character in a matching set, namely ^. If it is used in the first position, it means "match any character OTHER THAN." In any other position it stands for itself. For example, '[^^]' will match any single character OTHER THAN ^ (the first ^ has special meaning, the second stands in for itself). And, a closing bracket ] stands for itself if it is the second character in brackets, if the first character is ^ (with its SPECIAL meaning). That is, to match any single character OTHER THAN ], we can use the matching pattern '[^]]'.

like image 166
mathguy Avatar answered Sep 27 '22 23:09

mathguy