Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

REGEX Extract Amount Without Currency

Tags:

regex

sql

presto

SELECT
    ocr_text,
    bucket,
    REGEXP_EXTRACT('-?[0-9]+(\.[0-9]+)?', ocr_text)
FROM temp

I am trying to extract amounts from a string that will not have currency present. Any number that does not have decimals should not match. Commas should be allowed assuming they follow the correct rules (at hundreds marker)

56         no       (missing decimals)
56.45      yes  
120        no       (missing decimals)
120.00     yes
1200.00    yes
1,200.00   yes
1,200      no       (missing decimals)
1200       no       (missing decimals)
134.5      no       (decimal not followed by 2 digits)
23,00.00   no       (invalid comma location)

I'm a noob to REGEX so I know my above statement already does not meet the criteria i've listed. However, i'm already stuck getting the error (INVALID_FUNCTION_ARGUMENT) premature end of char-class on my REGEX_EXTRACT line


Can someone point me in the right direction? How can I resolve my current issue? How can I modify to correctly incorporate the other criteria listed?

like image 401
urdearboy Avatar asked Dec 05 '25 08:12

urdearboy


1 Answers

Here is a general regex pattern for a positive/negative number with two decimal places and optional thousands comma separators:

(?<!\S)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(?!\S)

Demo

Your updated query:

SELECT
    ocr_text,
    bucket,
    REGEXP_EXTRACT(ocr_text, '(?<!\S)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(?!\S)')
FROM temp;

From the Presto docs I read, it supposedly supports Java's regex syntax. In the event that lookarounds are not working, you may try this version:

SELECT
    ocr_text,
    bucket,
    REGEXP_EXTRACT(ocr_text, '(\s|^)(?:-?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{2})|-?[0-9]+(\.[0-9]{2}))(\s|$)')
FROM temp;
like image 190
Tim Biegeleisen Avatar answered Dec 07 '25 23:12

Tim Biegeleisen



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!