Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

isnumeric() with PostgreSQL

I need to determine whether a given string can be interpreted as a number (integer or floating point) in an SQL statement. As in the following:

SELECT AVG(CASE WHEN x ~ '^[0-9]*.?[0-9]*$' THEN x::float ELSE NULL END) FROM test 

I found that Postgres' pattern matching could be used for this. And so I adapted the statement given in this place to incorporate floating point numbers. This is my code:

WITH test(x) AS (     VALUES (''), ('.'), ('.0'), ('0.'), ('0'), ('1'), ('123'),     ('123.456'), ('abc'), ('1..2'), ('1.2.3.4'))  SELECT x      , x ~ '^[0-9]*.?[0-9]*$' AS isnumeric FROM test; 

The output:

    x    | isnumeric  ---------+-----------          | t  .       | t  .0      | t  0.      | t  0       | t  1       | t  123     | t  123.456 | t  abc     | f  1..2    | f  1.2.3.4 | f (11 rows) 

As you can see, the first two items (the empty string '' and the sole period '.') are misclassified as being a numeric type (which they are not). I can't get any closer to this at the moment. Any help appreciated!


Update Based on this answer (and its comments), I adapted the pattern to:

WITH test(x) AS (     VALUES (''), ('.'), ('.0'), ('0.'), ('0'), ('1'), ('123'),     ('123.456'), ('abc'), ('1..2'), ('1.2.3.4'), ('1x234'), ('1.234e-5'))  SELECT x      , x ~ '^([0-9]+[.]?[0-9]*|[.][0-9]+)$' AS isnumeric FROM test; 

Which gives:

     x    | isnumeric  ----------+-----------           | f  .        | f  .0       | t  0.       | t  0        | t  1        | t  123      | t  123.456  | t  abc      | f  1..2     | f  1.2.3.4  | f  1x234    | f  1.234e-5 | f (13 rows) 

There are still some issues with the scientific notation and with negative numbers, as I see now.

like image 380
moooeeeep Avatar asked Apr 24 '13 15:04

moooeeeep


People also ask

Is Numeric in Postgres?

PostgreSQL supports the NUMERIC type for storing numbers with a very large number of digits. Generally NUMERIC type are used for the monetary or amounts storage where precision is required. Syntax: NUMERIC(precision, scale) Where, Precision: Total number of digits.

How do you use IsNumeric function?

VBA example This example uses the IsNumeric function to determine if a variable can be evaluated as a number. MyVar = "53" ' Assign value. MyCheck = IsNumeric(MyVar) ' Returns True. MyVar = "459.95" ' Assign value.

Is serial integer in PostgreSQL?

PostgreSQL has a special kind of database object generator called SERIAL. It is used to generate a sequence of integers which are often used as the Primary key of a table. When creating a table, this sequence of integers can be created as follows: CREATE TABLE table_name( id SERIAL );

IS NULL function PostgreSQL?

If the expression is NULL, then the ISNULL function returns the replacement . Otherwise, it returns the result of the expression . For the COALESCE example, check it out the COALESCE function tutorial.


1 Answers

As you may noticed, regex-based method is almost impossible to do correctly. For example, your test says that 1.234e-5 is not valid number, when it really is. Also, you missed negative numbers. What if something looks like a number, but when you try to store it it will cause overflow?

Instead, I would recommend to create function that tries to actually cast to NUMERIC (or FLOAT if your task requires it) and returns TRUE or FALSE depending on whether this cast was successful or not.

This code will fully simulate function ISNUMERIC():

CREATE OR REPLACE FUNCTION isnumeric(text) RETURNS BOOLEAN AS $$ DECLARE x NUMERIC; BEGIN     x = $1::NUMERIC;     RETURN TRUE; EXCEPTION WHEN others THEN     RETURN FALSE; END; $$ STRICT LANGUAGE plpgsql IMMUTABLE; 

Calling this function on your data gets following results:

WITH test(x) AS ( VALUES (''), ('.'), ('.0'), ('0.'), ('0'), ('1'), ('123'),   ('123.456'), ('abc'), ('1..2'), ('1.2.3.4'), ('1x234'), ('1.234e-5')) SELECT x, isnumeric(x) FROM test;      x     | isnumeric ----------+-----------           | f  .        | f  .0       | t  0.       | t  0        | t  1        | t  123      | t  123.456  | t  abc      | f  1..2     | f  1.2.3.4  | f  1x234    | f  1.234e-5 | t  (13 rows) 

Not only it is more correct and easier to read, it will also work faster if data was actually a number.

like image 174
mvp Avatar answered Sep 24 '22 11:09

mvp