Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL, regex to match text fields with numeric values

I have a column of type text, that contains random values. Some of them are numbers, some other texts, and some other mixed text and numbers. I'm trying to discriminate them as:

myfield   is_numeric
____________________
-4        true
0004      true
4.00      true
dog       false
D04       false
04f       false

Whereas others are names and other strings. I was using the regular expression

SELECT id,
       myfield 
       (myfield::varchar~ '^-?[0-9]*.?[0-9]*$') is_numeric 
FROM mytable

To tell wether the row contains a valid number or not. However, I noticed that values like D04 and 04c return true for that regex, which, for my use case, is a false positive.

Why is this happening? It seems that ^ is matching not necessarily the whole value, but any valid substring of the value. However, values like D04f do return false, so even if there's a numeric substring in the field, the combination of ^ and $ operators is doing its job.

I have temporarily resorted to using:

SELECT id, 
       myfield
       (myfield::varchar ~ '^-?[0-9]*.?[0-9]*$' 
       AND myfield::varchar !~ '[^0-9\-\.]') is_numeric 
FROM mytable

But this seem inneficient (and does not rule out double dots), and I'm still wondering why the regex is correctly ruling out strings that start and end with a non-numeric character, while incorrecly returning true for string that contain only a trailing or leading non numeric character.

like image 792
ffflabs Avatar asked Feb 13 '17 16:02

ffflabs


1 Answers

Would this work for you?

^-?[0-9]+\.?[0-9]*$

I'm asuming -0.07.5 is invalid (double dot present).

D04 will return false as well.

The problem in your original regex is that you're not escaping the dot, so it will match any characher, including D in your D04.

Hope it helps.

like image 98
Eduardo Escobar Avatar answered Oct 14 '22 08:10

Eduardo Escobar