I am starting with Postgre Regular Expressions, working on PostgreSQL 8.3 version.
I have a table of records as following:
record
-----------
a
ab
abc
123abc
132abc
123ABC
abc123
4567
5678
6789
I was going through a simple tutorial: www.oreillynet.com. Everything seemed clear to me until I came to the following query:
SELECT record FROM myrecords WHERE record ~ '[^0-9]';
The tutorail says:
this expression returns strings that don't include digits.
But it returned following output to me:
output
------
a
ab
abc
123abc
132abc
123ABC
abc123
Can anyone explain me this behaviour, please? I am then struggling with another expression on that site trying to exclude strings that include digits AND lower-case letters in the same string.
<> is the standard SQL operator meaning "not equal". Many databases, including postgresql, supports != as a synonym for <> . They're exactly the same in postgresql.
The simplest use of regex in PostgreSQL is the ~ operator, and its cousin the ~* operator. value ~ regex tests the value on the left against the regex on the right and returns true if the regex can match within the value. Note that the regex does not have to fully match the whole value, it just has to match a part.
The keyword ILIKE can be used instead of LIKE to make the match case insensitive according to the active locale. This is not in the SQL standard but is a PostgreSQL extension. The operator ~~ is equivalent to LIKE , and ~~* corresponds to ILIKE .
This command:
SELECT record FROM myrecords WHERE record ~ '[^0-9]';
means that in the record
field there should be at least one non-digit character (this is the meaning of the regex).
If one looks for the records which would include digits and lower-case letter, then I would expect a regex like:
SELECT record FROM myrecords WHERE record ~ '[0-9a-z]';
which would return all the records having at least one character which is a digit or lowercase letter.
If you want to get the records which have no digits, then you would have to use the following regex:
SELECT record FROM myrecords WHERE record ~ '^[^0-9]+$';
Here, the ^
character outside of square brackets means the beginning of the field, the $
character means the end of the field, and we require that all characters in between are non-digits. +
indicates that there should be at least one such characters. If we would also allow empty strings, then the regex would look like ^[^0-9]*$
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With