Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of SUBSTR on CLOB

Tags:

I have a PL/SQL procedure that does a lot of SUBSTRs on a VARCHAR2 parameter. I would like to remove the length limit, so I tried to change it to CLOB.

Works fine, but performance suffers, so I did some tests (based on these tests from 2005).


UPDATE: I can reproduce this on several different instances with different Oracle versions and different hardware, dbms_lob.substr is always noticeable slower than substr(CLOB), and a lot slower than SUBSTR(VARCHAR2).

Bob's results and the tests in the link above tell a different story.

Can anyone explain this, or at least reproduce either Bob's or my results? Thanks!


Test results:

+000000000 00:00:00.004000000 (VARCHAR2)
+000000000 00:00:00.298000000 (CLOB SUBSTR)
+000000000 00:00:00.356000000 (DBMS_LOB.SUBSTR)

Test code:

DECLARE   l_text   VARCHAR2(30) := 'This is a test record';   l_clob   CLOB := l_text;   l_substr VARCHAR2(30);   t TIMESTAMP; BEGIN   t := SYSTIMESTAMP;   FOR i IN 1..100000 LOOP     l_substr := SUBSTR(l_text,1,14);   END LOOP;   dbms_output.put_line( SYSTIMESTAMP - t || ' (VARCHAR2)');    t := SYSTIMESTAMP;   FOR i IN 1..100000 LOOP     l_substr := SUBSTR(l_clob,1,14);   END LOOP;   dbms_output.put_line( SYSTIMESTAMP - t || ' (CLOB SUBSTR)');    t := SYSTIMESTAMP;   FOR i IN 1..100000 LOOP     l_substr := DBMS_LOB.SUBSTR(l_clob,14,1);   END LOOP;   dbms_output.put_line( SYSTIMESTAMP - t || ' (DBMS_LOB.SUBSTR)'); END; 
like image 830
Peter Lang Avatar asked Apr 26 '12 10:04

Peter Lang


People also ask

How does Oracle handle CLOB data types?

In Oracle, you can use LENGTH() or DBMS_LOB. GETLENGTH() function to get the length of a CLOB column in characters. There is LENGTHB() function to get the length in bytes, but it can be used for single-byte CLOBs only, so it will return the same result as LENGTH().

What is the maximum size for CLOB column?

A CLOB (character large object) value can be up to 2,147,483,647 characters long.

What is the output of Substr?

The SUBSTR( ) function returns characters from the string value starting at the character position specified by start. The number of characters returned is specified by length.


1 Answers

(Lies, damn lies, and benchmarks...)

I re-ran your test 10 times, expanding the string so it was a full 30 characters long, and got the following averaged results:

+000000000 00:00:00.011694200 (VARCHAR2) +000000000 00:00:00.901000600 (CLOB SUBSTR) +000000000 00:00:00.013169200 (DBMS_LOB.SUBSTR) 

I then changed the substring range to 5,14 (14,5 for DBMS_LOB.SUBSTR) and got:

+000000000 00:00:00.011731000 (VARCHAR2) +000000000 00:00:01.010840000 (CLOB SUBSTR) +000000000 00:00:00.011427000 (DBMS_LOB.SUBSTR) 

I then changed the range to 17,14 (14,17 for DBMS_LOB.SUBSTR) and got

+000000000 00:00:00.013578900 (VARCHAR2) +000000000 00:00:00.964527400 (CLOB SUBSTR) +000000000 00:00:00.011416800 (DBMS_LOB.SUBSTR) 

Finally, I changed the range to 25,14 (14,25 for DBMS_LOB.SUBSTR) and got

+000000000 00:00:00.011210200 (VARCHAR2) +000000000 00:00:00.916439800 (CLOB SUBSTR) +000000000 00:00:00.013781300 (DBMS_LOB.SUBSTR) 

My conclusion is that when working against CLOB's it's best to use DBMS_LOB.SUBSTR as it appears to have effectively no performance penalty compared to using SUBSTR against a "normal" VARCHAR2. SUBSTR against a CLOB seems to suffer from a significant performance penalty. For the record - OS = HP/UX (Unix variant), Oracle version=11.1, processor=HP Itanium 2-plex. YMMV.

Share and enjoy.


And because if it's worth doing it's worth over-doing, here's some more results with the strings expanded to 32767 characters. Substring ranges given with each set of results:

1, 25000 +000000000 00:00:00.198466400 (VARCHAR2) +000000000 00:00:02.870958700 (CLOB SUBSTR) +000000000 00:00:00.174490100 (DBMS_LOB.SUBSTR)  1000, 25000 +000000000 00:00:00.253447900 (VARCHAR2) +000000000 00:00:02.491790500 (CLOB SUBSTR) +000000000 00:00:00.193560100 (DBMS_LOB.SUBSTR)  10000, 25000 +000000000 00:00:00.217812000 (VARCHAR2) +000000000 00:00:02.268794800 (CLOB SUBSTR) +000000000 00:00:00.222200200 (DBMS_LOB.SUBSTR) 

Same day, same conclusion.

Cthulhu fhtagn.


(Once more unto the breach, dear friends, once more...)

Re-ran the benchmarks, changing the size of the CLOB to 3276700, and taking the substring from the middle starting at 2475000 for length 25000 I get:

+000000000 00:00:00.176883200 (VARCHAR2) +000000000 00:00:02.069482600 (CLOB SUBSTR) +000000000 00:00:00.175341500 (DBMS_LOB.SUBSTR) 

(Note that changes only affect the last two tests).

AND...same results, different day.

YMMV.

like image 104


Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!