
Extract numbers from a field in PostgreSQL

I have a table with a column po_number of type varchar in Postgres 8.4. It stores alphanumeric values with some special characters. I want to ignore the characters [/alpha/?/$/encoding/.] and check whether the column contains a number. If it is a number, it needs to be cast to a number; otherwise NULL should be returned, as my output field po_number_new is a number field.

Below is the example:

example

SQL Fiddle.

I tried this statement:

select 
(case when  regexp_replace(po_number,'[^\w],.-+\?/','') then po_number::numeric
else null
end) as po_number_new from test

But I got an error for explicit cast:

error

user1538020 asked Nov 12 '16 15:11


3 Answers

Simply:

SELECT NULLIF(regexp_replace(po_number, '\D','','g'), '')::numeric AS result
FROM   tbl;

\D being the class shorthand for "not a digit".
And you need the 4th parameter 'g' (for "globally") to replace all occurrences.
Details in the manual.
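
To make the effect of the 'g' flag and of NULLIF() concrete, here is a small self-contained sketch. The sample values are made up for illustration, and it assumes a version where standard_conforming_strings is on (the default since Postgres 9.1), so that '\D' reaches the regex engine unchanged:

```sql
-- Hypothetical sample values; t(po_number) stands in for the real table.
SELECT po_number,
       regexp_replace(po_number, '\D', '')      AS first_match_only,  -- removes only the first non-digit
       regexp_replace(po_number, '\D', '', 'g') AS all_matches,       -- removes every non-digit
       NULLIF(regexp_replace(po_number, '\D', '', 'g'), '')::numeric AS result
FROM  (VALUES ('PO-123/45'), ('abc'), ('$99.50')) AS t(po_number);

-- po_number | first_match_only | all_matches | result
-- PO-123/45 | O-123/45         | 12345       | 12345
-- abc       | bc               | (empty)     | NULL
-- $99.50    | 99.50            | 9950        | 9950
```

Without NULLIF(), the empty string left over from 'abc' would make the cast to numeric fail. Note also that a decimal point is removed like any other non-digit, so '$99.50' becomes 9950.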

For a known, limited set of characters to replace, plain string manipulation functions like replace() or translate() are substantially cheaper. Regular expressions are just more versatile, and we want to eliminate everything but digits in this case. Related:

  • Regex remove all occurrences of multiple characters in a string
  • PostgreSQL SELECT only alpha characters on a row
  • Is there a regexp_replace equivalent for postgresql 7.4?
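
As a rough sketch of the cheaper alternative mentioned above: when the characters to drop are a known, fixed set, translate() can remove them without a regular expression. The set '/?$.' below is only an assumed example, not the full list from the question:

```sql
-- Characters in the second argument that have no counterpart
-- in the third argument are removed from the string.
SELECT po_number,
       translate(po_number, '/?$.', '') AS stripped
FROM  (VALUES ('12/34'), ('$99.50?')) AS t(po_number);

-- 12/34   -> 1234
-- $99.50? -> 9950
```

This does not help with the "alpha" part of the question, since listing every letter is impractical. That is where the regular expression is the better tool.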

But why Postgres 8.4? Consider upgrading to a modern version.

Consider pitfalls for outdated versions:

  • Order varchar string as numeric
  • WARNING: nonstandard use of escape in a string literal
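
One pitfall that applies to the query above on 8.4 specifically: standard_conforming_strings is off by default there, so a backslash in a plain string literal is treated as an escape character, and '\D' should arrive at the regex engine as a plain 'D' (accompanied by the warning named in the second link). Two ways around this, sketched against the same placeholder table tbl:

```sql
-- Option 1: escape string syntax with a doubled backslash
SELECT NULLIF(regexp_replace(po_number, E'\\D', '', 'g'), '')::numeric AS result
FROM   tbl;

-- Option 2: a bracket expression, which needs no backslash at all
SELECT NULLIF(regexp_replace(po_number, '[^0-9]', '', 'g'), '')::numeric AS result
FROM   tbl;
```

Both forms behave the same regardless of the standard_conforming_strings setting, so they also keep working after an upgrade.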

Erwin Brandstetter answered Nov 08 '22 09:11


I think you want something like this:

select (case when regexp_replace(po_number, '[^\w],.-+\?/', '') ~ '^[0-9]+$'
             then regexp_replace(po_number, '[^\w],.-+\?/', '')::numeric
        end) as po_number_new 
from test;

That is, you need to do the conversion on the string after replacement.

Note: This assumes that the "number" is just a string of digits.
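
A variation on the same idea, as a sketch only: compute the cleaned string once in a subquery and test it there. Note that in '[^\w],.-+\?/' only [^\w] is a character class; the rest is a literal sequence, so the pattern rarely matches anything. The version below assumes the intent was to strip the punctuation characters , . + ? / and - wherever they occur, which is a guess at what the question meant:

```sql
-- Hypothetical sample values; replace the VALUES list with the real table.
SELECT po_number,
       CASE WHEN cleaned ~ '^[0-9]+$' THEN cleaned::numeric END AS po_number_new
FROM  (
   SELECT po_number,
          -- bracket expression: each listed character is matched literally,
          -- the hyphen is placed last so it is not read as a range
          regexp_replace(po_number, '[,.+?/-]', '', 'g') AS cleaned
   FROM  (VALUES ('12/34'), ('AB-12'), ('1,000.50')) AS t(po_number)
) sub;

-- 12/34    -> 1234
-- AB-12    -> NULL   (letters remain, so it is not a pure digit string)
-- 1,000.50 -> 100050
```

The pattern contains no backslashes, so it is not affected by string escaping on 8.4.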

Gordon Linoff answered Nov 08 '22 09:11


To determine whether po_number contains any digits, I would check whether its length decreases after removing all digits.

If it does, every non-digit character ([^\d]) should be removed from the po_number column. Otherwise, NULL should be returned.

select case when char_length(regexp_replace(po_number, '\d', '', 'g')) < char_length(po_number)
            then regexp_replace(po_number, '[^0-9]', '', 'g')::numeric
            else null
       end as po_number_new
from test;

Tim Biegeleisen answered Nov 08 '22 10:11