I have a query that returns values which each contain 255 comma-separated items. Is there an easy way to split those into columns without writing 255 SUBSTR calls?
ROW | VAL
------------------------
1   | 1.25, 3.87, 2, ...
2   | 5, 4, 3.3, ...
to
ROW | VAL  | VAL  | VAL ...
---------------------------
1   | 1.25 | 3.87 | 2   ...
2   | 5    | 4    | 3.3 ...
Using the STRING_SPLIT() function (SQL Server) to split a comma-separated string in a column. Sometimes database tables are not normalized. A typical example is a column that stores multiple values separated by commas. STRING_SPLIT() can help normalize the data by splitting these multi-valued columns.
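For comparison, here is a minimal sketch of that approach. It is T-SQL and assumes SQL Server 2016 or later; the table variable, its column names, and the sample data are made up for illustration. Note that STRING_SPLIT() returns one row per element rather than one column per element, and it is not available in Oracle.

```sql
-- T-SQL sketch (SQL Server 2016+). @t and its contents are hypothetical.
DECLARE @t TABLE (row_id INT, val VARCHAR(100));
INSERT INTO @t VALUES (1, '1.25,3.87,2'), (2, '5,4,3.3');

SELECT t.row_id,
       s.value                              -- one output row per list element
FROM   @t AS t
CROSS APPLY STRING_SPLIT(t.val, ',') AS s;  -- split each row's list on commas
```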
1- Example -- ORACLE > 9.x

Select Regexp_Substr('KING,JONES,FORD', '[^,]+', 1, Level) Emp_Name
From Dual
Connect By Regexp_Substr('KING,JONES,FORD', '[^,]+', 1, Level) Is Not Null;

Find the employees named in a String separated by commas.

Select * From Employee Emp Where Emp.
Beware! The regexp_substr expression of the format '[^,]+' will not return the expected value if there is a null element in the list and you want that item or one after it. Consider this example where the 4th element is NULL and I want the 5th element and thus expect the '5' to be returned:
SQL> select regexp_substr('1,2,3,,5,6', '[^,]+', 1, 5) from dual;
R
-
6
Surprise! It returns the 5th NON-NULL element, not the actual 5th element! Incorrect data returned and you may not even catch it. Try this instead:
SQL> select regexp_substr('1,2,3,,5,6', '(.*?)(,|$)', 1, 5, NULL, 1) from dual;
R
-
5
So, the above corrected REGEXP_SUBSTR says to look for the 5th occurrence of 0 or more comma-delimited characters followed by a comma or the end of the line (allows for the next separator, be it a comma or the end of the line) and when found return the 1st subgroup (the data NOT including the comma or end of the line).
The search match pattern '(.*?)(,|$)' explained:
( = Start a group
. = match any character
* = 0 or more matches of the preceding character
? = Placed after *, makes the match non-greedy (match as few characters as possible)
) = End the 1st group
( = Start a new group (also used for logical OR)
, = comma
| = OR
$ = End of the line
) = End the 2nd group
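Applied to the original question, the same pattern can be repeated once per output column, changing only the occurrence number. The following is a minimal sketch, assuming Oracle 11g or later (needed for the subexpression argument); the inline data, the row_id column, and the val1..val3 aliases are invented for illustration. TRIM removes the space that follows each comma in the sample data.

```sql
-- Hypothetical inline data standing in for the question's table.
with t as (
  select 1 as row_id, '1.25, 3.87, 2' as val from dual union all
  select 2 as row_id, '5,,3.3'        as val from dual
)
select row_id,
       -- Nth occurrence of "anything up to a comma or end of line",
       -- returning only the 1st subgroup (the data without the comma).
       trim(regexp_substr(val, '(.*?)(,|$)', 1, 1, null, 1)) as val1,
       trim(regexp_substr(val, '(.*?)(,|$)', 1, 2, null, 1)) as val2,
       trim(regexp_substr(val, '(.*?)(,|$)', 1, 3, null, 1)) as val3
from   t
order  by row_id;
```

For row 2, val2 comes back NULL and val3 is 3.3, so the empty element keeps its position. A plain SELECT still needs one expression per column, so 255 columns means 255 of these lines, but they differ only in the occurrence number and are easy to generate.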
EDIT: More info added and simplified the regex.
See this post for more info and a suggestion to encapsulate this in a function for easy reuse: REGEX to select nth value from a list, allowing for nulls
It's the post where I discovered the format '[^,]+' has the problem. Unfortunately it's the regex format you will most commonly see as the answer for questions regarding how to parse a list. I shudder to think of all the incorrect data being returned by '[^,]+'!
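As a rough idea of what wrapping this in a function could look like, here is a sketch. The function name and parameter names are hypothetical and not taken from the linked post. It assumes Oracle 11g or later and a delimiter that contains no regex metacharacters (a delimiter such as | or . would need escaping first).

```sql
-- Hypothetical helper: returns the Nth element of a delimited list,
-- counting empty (NULL) elements as positions.
create or replace function get_list_element(
  p_list      in varchar2,
  p_position  in number,
  p_delimiter in varchar2 default ','   -- assumed to be regex-safe
) return varchar2
is
begin
  -- Same pattern as above, with the delimiter spliced into the 2nd group.
  return regexp_substr(p_list,
                       '(.*?)(' || p_delimiter || '|$)',
                       1, p_position, null, 1);
end get_list_element;
/

-- Usage: the 5th element of the earlier example list.
select get_list_element('1,2,3,,5,6', 5) as element from dual;
```

The usage query returns 5, matching the corrected REGEXP_SUBSTR call shown earlier.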