Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL Regular Expression - Excluding strings with numbers

I am starting with Postgre Regular Expressions, working on PostgreSQL 8.3 version.

I have a table of records as following:

record
-----------
a  
ab
abc  
123abc 
132abc
123ABC  
abc123
4567  
5678
6789  

I was going through a simple tutorial: www.oreillynet.com. Everything seemed clear to me until I came to the following query:

SELECT record FROM myrecords WHERE record ~ '[^0-9]';

The tutorail says:

this expression returns strings that don't include digits.

But it returned following output to me:

output
------
a
ab
abc
123abc
132abc
123ABC
abc123

Can anyone explain me this behaviour, please? I am then struggling with another expression on that site trying to exclude strings that include digits AND lower-case letters in the same string.

like image 434
Trocader Avatar asked Mar 16 '14 12:03

Trocader


People also ask

What does <> mean in PostgreSQL?

<> is the standard SQL operator meaning "not equal". Many databases, including postgresql, supports != as a synonym for <> . They're exactly the same in postgresql.

Can you use regex in PostgreSQL?

The simplest use of regex in PostgreSQL is the ~ operator, and its cousin the ~* operator. value ~ regex tests the value on the left against the regex on the right and returns true if the regex can match within the value. Note that the regex does not have to fully match the whole value, it just has to match a part.

What is the difference between like and Ilike?

The keyword ILIKE can be used instead of LIKE to make the match case insensitive according to the active locale. This is not in the SQL standard but is a PostgreSQL extension. The operator ~~ is equivalent to LIKE , and ~~* corresponds to ILIKE .


1 Answers

This command:

SELECT record FROM myrecords WHERE record ~ '[^0-9]';

means that in the record field there should be at least one non-digit character (this is the meaning of the regex).

If one looks for the records which would include digits and lower-case letter, then I would expect a regex like:

SELECT record FROM myrecords WHERE record ~ '[0-9a-z]';

which would return all the records having at least one character which is a digit or lowercase letter.

If you want to get the records which have no digits, then you would have to use the following regex:

SELECT record FROM myrecords WHERE record ~ '^[^0-9]+$';

Here, the ^ character outside of square brackets means the beginning of the field, the $ character means the end of the field, and we require that all characters in between are non-digits. + indicates that there should be at least one such characters. If we would also allow empty strings, then the regex would look like ^[^0-9]*$.

like image 138
Ashalynd Avatar answered Oct 08 '22 06:10

Ashalynd