Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of SUBSTR on CLOB

Tags:

I have a PL/SQL procedure that does a lot of SUBSTRs on a VARCHAR2 parameter. I would like to remove the length limit, so I tried to change it to CLOB.

Works fine, but performance suffers, so I did some tests (based on these tests from 2005).


UPDATE: I can reproduce this on several different instances with different Oracle versions and different hardware, dbms_lob.substr is always noticeable slower than substr(CLOB), and a lot slower than SUBSTR(VARCHAR2).

Bob's results and the tests in the link above tell a different story.

Can anyone explain this, or at least reproduce either Bob's or my results? Thanks!


Test results:

+000000000 00:00:00.004000000 (VARCHAR2)
+000000000 00:00:00.298000000 (CLOB SUBSTR)
+000000000 00:00:00.356000000 (DBMS_LOB.SUBSTR)

Test code:

DECLARE   l_text   VARCHAR2(30) := 'This is a test record';   l_clob   CLOB := l_text;   l_substr VARCHAR2(30);   t TIMESTAMP; BEGIN   t := SYSTIMESTAMP;   FOR i IN 1..100000 LOOP     l_substr := SUBSTR(l_text,1,14);   END LOOP;   dbms_output.put_line( SYSTIMESTAMP - t || ' (VARCHAR2)');    t := SYSTIMESTAMP;   FOR i IN 1..100000 LOOP     l_substr := SUBSTR(l_clob,1,14);   END LOOP;   dbms_output.put_line( SYSTIMESTAMP - t || ' (CLOB SUBSTR)');    t := SYSTIMESTAMP;   FOR i IN 1..100000 LOOP     l_substr := DBMS_LOB.SUBSTR(l_clob,14,1);   END LOOP;   dbms_output.put_line( SYSTIMESTAMP - t || ' (DBMS_LOB.SUBSTR)'); END; 
like image 830
Peter Lang Avatar asked Apr 26 '12 10:04

Peter Lang


People also ask

How does Oracle handle CLOB data types?

In Oracle, you can use LENGTH() or DBMS_LOB. GETLENGTH() function to get the length of a CLOB column in characters. There is LENGTHB() function to get the length in bytes, but it can be used for single-byte CLOBs only, so it will return the same result as LENGTH().

What is the maximum size for CLOB column?

A CLOB (character large object) value can be up to 2,147,483,647 characters long.

What is the output of Substr?

The SUBSTR( ) function returns characters from the string value starting at the character position specified by start. The number of characters returned is specified by length.


1 Answers

(Lies, damn lies, and benchmarks...)

I re-ran your test 10 times, expanding the string so it was a full 30 characters long, and got the following averaged results:

+000000000 00:00:00.011694200 (VARCHAR2) +000000000 00:00:00.901000600 (CLOB SUBSTR) +000000000 00:00:00.013169200 (DBMS_LOB.SUBSTR) 

I then changed the substring range to 5,14 (14,5 for DBMS_LOB.SUBSTR) and got:

+000000000 00:00:00.011731000 (VARCHAR2) +000000000 00:00:01.010840000 (CLOB SUBSTR) +000000000 00:00:00.011427000 (DBMS_LOB.SUBSTR) 

I then changed the range to 17,14 (14,17 for DBMS_LOB.SUBSTR) and got

+000000000 00:00:00.013578900 (VARCHAR2) +000000000 00:00:00.964527400 (CLOB SUBSTR) +000000000 00:00:00.011416800 (DBMS_LOB.SUBSTR) 

Finally, I changed the range to 25,14 (14,25 for DBMS_LOB.SUBSTR) and got

+000000000 00:00:00.011210200 (VARCHAR2) +000000000 00:00:00.916439800 (CLOB SUBSTR) +000000000 00:00:00.013781300 (DBMS_LOB.SUBSTR) 

My conclusion is that when working against CLOB's it's best to use DBMS_LOB.SUBSTR as it appears to have effectively no performance penalty compared to using SUBSTR against a "normal" VARCHAR2. SUBSTR against a CLOB seems to suffer from a significant performance penalty. For the record - OS = HP/UX (Unix variant), Oracle version=11.1, processor=HP Itanium 2-plex. YMMV.

Share and enjoy.


And because if it's worth doing it's worth over-doing, here's some more results with the strings expanded to 32767 characters. Substring ranges given with each set of results:

1, 25000 +000000000 00:00:00.198466400 (VARCHAR2) +000000000 00:00:02.870958700 (CLOB SUBSTR) +000000000 00:00:00.174490100 (DBMS_LOB.SUBSTR)  1000, 25000 +000000000 00:00:00.253447900 (VARCHAR2) +000000000 00:00:02.491790500 (CLOB SUBSTR) +000000000 00:00:00.193560100 (DBMS_LOB.SUBSTR)  10000, 25000 +000000000 00:00:00.217812000 (VARCHAR2) +000000000 00:00:02.268794800 (CLOB SUBSTR) +000000000 00:00:00.222200200 (DBMS_LOB.SUBSTR) 

Same day, same conclusion.

Cthulhu fhtagn.


(Once more unto the breach, dear friends, once more...)

Re-ran the benchmarks, changing the size of the CLOB to 3276700, and taking the substring from the middle starting at 2475000 for length 25000 I get:

+000000000 00:00:00.176883200 (VARCHAR2) +000000000 00:00:02.069482600 (CLOB SUBSTR) +000000000 00:00:00.175341500 (DBMS_LOB.SUBSTR) 

(Note that changes only affect the last two tests).

AND...same results, different day.

YMMV.

like image 104