I want to be able to split csv strings in Oracle 9i
I've read the following article http://www.oappssurd.com/2009/03/string-split-in-oracle.html
But I didn't understand how to make this work. Here are some of my questions pertaining to it
Description This is a small pipelined table function that gets one string that includes a delimited list of values, and returns these values as a table.
A delimiter-separated string can be converted to a set of rows in Oracle SQL, with the combination of the regex function REGEX_SUBSTR and recursion via CONNECT BY. This feature can be used for splitting a single input string with comma-separated query parameters, into a list of values.
You need to combine two nested statement: Select * From Employee Emp Where Emp. Emp_Name In (Select Regexp_Substr('KING,JONES,FORD' ,'[^,]+' ,1 ,Level) Emp_Name From Dual Connect By Regexp_Substr('KING,JONES,FORD' ,'[^,]+' ,1 ,Level) Is Not Null);
Joyce,
Here are three examples:
1) Using dbms_utility.comma_to_table. This is not a general purpose routine, because the elements should be valid identifiers. With some dirty tricks we can make it work more universal:
SQL> declare
2 cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
3 mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
4 l_tablen binary_integer;
5 l_tab dbms_utility.uncl_array;
6 begin
7 dbms_utility.comma_to_table
8 ( list => cn_non_occuring_prefix || replace(mystring,':',','||cn_non_occuring_prefix)
9 , tablen => l_tablen
10 , tab => l_tab
11 );
12 for i in 1..l_tablen
13 loop
14 dbms_output.put_line(substr(l_tab(i),1+length(cn_non_occuring_prefix)));
15 end loop;
16 end;
17 /
a
sd
dfg
31456
dasd
sdfsdf
PL/SQL-procedure is geslaagd.
2) Using SQL's connect by level. If you are on 10g or higher you can use the connect-by-level approach in combination with regular expressions, like this:
SQL> declare
2 mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
3 begin
4 for r in
5 ( select regexp_substr(mystring,'[^:]+',1,level) element
6 from dual
7 connect by level <= length(regexp_replace(mystring,'[^:]+')) + 1
8 )
9 loop
10 dbms_output.put_line(r.element);
11 end loop;
12 end;
13 /
a
sd
dfg
31456
dasd
sdfsdf
PL/SQL-procedure is geslaagd.
3) Again using SQL's connect by level, but now in combination with good old SUBSTR/INSTR in case you are on version 9, like you are:
SQL> declare
2 mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
3 begin
4 for r in
5 ( select substr
6 ( str
7 , instr(str,':',1,level) + 1
8 , instr(str,':',1,level+1) - instr(str,':',1,level) - 1
9 ) element
10 from (select ':' || mystring || ':' str from dual)
11 connect by level <= length(str) - length(replace(str,':')) - 1
12 )
13 loop
14 dbms_output.put_line(r.element);
15 end loop;
16 end;
17 /
a
sd
dfg
31456
dasd
sdfsdf
PL/SQL-procedure is geslaagd.
You can see some more techniques like these, in this blogpost: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html
Hope this helps.
Regards, Rob.
To address your comment:
An example of inserting the separated values into a normalized table.
First create the tables:
SQL> create table csv_table (col)
2 as
3 select 'a,sd,dfg,31456,dasd,,sdfsdf' from dual union all
4 select 'a,bb,ccc,dddd' from dual union all
5 select 'zz,yy,' from dual
6 /
Table created.
SQL> create table normalized_table (value varchar2(10))
2 /
Table created.
Because you seem interested in the dbms_utility.comma_to_table approach, I mention it here. However, I certainly do not recommend this variant, because of the identifier quirks and because of the slow row by row processing.
SQL> declare
2 cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
3 l_tablen binary_integer;
4 l_tab dbms_utility.uncl_array;
5 begin
6 for r in (select col from csv_table)
7 loop
8 dbms_utility.comma_to_table
9 ( list => cn_non_occuring_prefix || replace(r.col,',',','||cn_non_occuring_prefix)
10 , tablen => l_tablen
11 , tab => l_tab
12 );
13 forall i in 1..l_tablen
14 insert into normalized_table (value)
15 values (substr(l_tab(i),length(cn_non_occuring_prefix)+1))
16 ;
17 end loop;
18 end;
19 /
PL/SQL procedure successfully completed.
SQL> select * from normalized_table
2 /
VALUE
----------
a
sd
dfg
31456
dasd
sdfsdf
a
bb
ccc
dddd
zz
yy
14 rows selected.
I do recommend this single SQL variant:
SQL> truncate table normalized_table
2 /
Table truncated.
SQL> insert into normalized_table (value)
2 select substr
3 ( col
4 , instr(col,',',1,l) + 1
5 , instr(col,',',1,l+1) - instr(col,',',1,l) - 1
6 )
7 from ( select ',' || col || ',' col from csv_table )
8 , ( select level l from dual connect by level <= 100 )
9 where l <= length(col) - length(replace(col,',')) - 1
10 /
14 rows created.
SQL> select * from normalized_table
2 /
VALUE
----------
a
a
zz
sd
bb
yy
dfg
ccc
31456
dddd
dasd
sdfsdf
14 rows selected.
Regards, Rob.
Here's a string tokenizer for Oracle that's a little more straightforward than that page, but no idea if it's as fast:
create or replace function splitter_count(str in varchar2, delim in char) return int as
val int;
begin
val := length(replace(str, delim, delim || ' '));
return val - length(str);
end;
create type token_list is varray(100) of varchar2(200);
CREATE or replace function tokenize (str varchar2, delim char) return token_list as
ret token_list;
target int;
i int;
this_delim int;
last_delim int;
BEGIN
ret := token_list();
i := 1;
last_delim := 0;
target := splitter_count(str, delim);
while i <= target
loop
ret.extend();
this_delim := instr(str, delim, 1, i);
ret(i):= substr(str, last_delim + 1, this_delim - last_delim -1);
i := i + 1;
last_delim := this_delim;
end loop;
ret.extend();
ret(i):= substr(str, last_delim + 1);
return ret;
end;
You can use it like this:
select tokenize('hi you person', ' ') from dual;
VARCHAR(hi,you,person)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With