Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to split a string containing a numeric value into three parts in PostgreSQL?

I want to split a given string which could possibly contain a numeric value, using regexp_matches(). It should identify the first occurrence of a numeric value containing an optional sign and optional decimal places. The non matching parts should be returned as well - as first and last positions of the array.

Some example input and expected output values:

'hello+111123.454545world' -> {hello,+111123.454545,world}
'he-lo+111123.454545world' -> {he-lo,+111123.454545,world}
'hel123.5lo+111123.454545world' -> {hel,123.5,lo+111123.454545world}
'hello+111123.454545world' -> {hello,+111123.454545,world}
'hello+111123.454545world' -> {hello,+111123.454545,world}
'1111.15' -> {"",1111.15,""}
'-.234' -> {"",-.234,""}
'hello-.234' -> {hello,-.234,""}

I'm having trouble with the first part of the match group in the following expression represented by 'TODO'. It is supposed to match anything that cannot be identified as a numeric value.

select regexp_matches('input', '(TODO)((?:\+|-)?(?:\d*(?:(?:\.)?\d+)))(.*)')

The match group represented by '(TODO)' needs to be the negation of the regular expression in the second match group. (As the result is required to be returned). The regex for matching the numeric value works fine, and what I need is how to match the first part of the string which is not a numeric value.

like image 469
sanjayav Avatar asked Sep 27 '22 05:09

sanjayav


1 Answers

regexp_matches(input, '(^.*?)([+-]?\d*\.?\d+)(.*$)') AS result_arr
  • 1st match: (^.*?)
    Anchored to the start of the string with ^. The non-greedy quantifier *? is crucial.
    It actually doesn't have to be the negation of the regular expression in the second match group because the rest of the regular expression is greedy. So the first part is what remains, defined by the rest.

  • 2nd match: ([+-]?\d*?\.?\d+)
    I simplified your expression somewhat. In particular a character class [+-] is shorter and faster than two branches in non-capturing parentheses (?:\+|-).
    Non-capturing parentheses are important. (You already had that.)
    Simplified \d* after comment from @maraca.

  • 3rd match: (.*$)
    Anchored to the end of the string with $. For the last match, make the quantifier greedy.

SQL Fiddle with extended test case.

like image 124
Erwin Brandstetter Avatar answered Nov 15 '22 12:11

Erwin Brandstetter