According to the documents I've read, the default storage for a CLOB or BLOB is inline, which means that if it is less than approx 4k in size then it will be held in the table.
But when I test this on a dummy table in Oracle (10.2.0.1.0) the performance and response from Oracle Monitor (by Allround Automations) suggest that it is being held outwith the table.
Here's my test scenario ...
create table clobtest ( x int primary key, y clob, z varchar(100) )
;
insert into clobtest
select object_id, object_name, object_name
from all_objects where rownum < 10001
;
select COLUMN_NAME, IN_ROW
from user_lobs
where table_name = 'CLOBTEST'
;
This shows: Y YES (suggesting that Oracle will store the clob in the row)
select x, y from CLOBTEST where ROWNUM < 1001 -- 8.49 seconds
select x, z from CLOBTEST where ROWNUM < 1001 -- 0.298 seconds
So in this case, the CLOB values will have a maximum length of 30 characters, so should always be inline. If I run Oracle Monitor, it shows a LOB.Length followed by a LOB.Read() for each row returned, again suggesting that the clob values are held outwith the table.
I also tried creating the table like this
create table clobtest
( x int primary key, y clob, z varchar(100) )
LOB (y) STORE AS (ENABLE STORAGE IN ROW)
but got exactly the same results.
Does anyone have any suggestions how I can force (persuade, encourage) Oracle to store the clob value in-line in the table? (I'm hoping to achieve similar response times to reading the varchar2 column z)
UPDATE: If I run this SQL
select COLUMN_NAME, IN_ROW, l.SEGMENT_NAME, SEGMENT_TYPE, BYTES, BLOCKS, EXTENTS
from user_lobs l
JOIN USER_SEGMENTS s
on (l.Segment_Name = s. segment_name )
where table_name = 'CLOBTEST'
then I get the following results ...
Y YES SYS_LOB0000398621C00002$$ LOBSEGMENT 65536 8 1
The behavior of Oracle LOBs is the following.
A LOB is stored inline when:
(
The size is lower or equal than 3964
AND
ENABLE STORAGE IN ROW has been defined in the LOB storage clause
) OR (
The value is NULL
)
A LOB is stored out-of-row when:
(
The value is not NULL
) AND (
Its size is higher than 3964
OR
DISABLE STORAGE IN ROW has been defined in the LOB storage clause
)
Now this is not the only issue which may impact performance.
If the LOBs are finally not stored inline, the default behavior of Oracle is to avoid caching them (only inline LOBs are cached in the buffer cache with the other fields of the row). To tell Oracle to also cache non inlined LOBs, the CACHE option should be used when the LOB is defined.
The default behavior is ENABLE STORAGE IN ROW, and NOCACHE, which means small LOBs will be inlined, large LOBs will not (and will not be cached).
Finally, there is also a performance issue at the communication protocol level. Typical Oracle clients will perform 2 additional roundtrips per LOBs to fetch them: - one to retrieve the size of the LOB and allocate memory accordingly - one to fetch the data itself (provided the LOB is small)
These extra roundtrips are performed even if an array interface is used to retrieve the results. If you retrieve 1000 rows and your array size is large enough, you will pay for 1 roundtrip to retrieve the rows, and 2000 roundtrips to retrieve the content of the LOBs.
Please note it does not depend on the fact the LOB is stored inline or not. They are complete different problems.
To optimize at the protocol level, Oracle has provided a new OCI verb to fetch several LOBs in one roundtrips (OCILobArrayRead). I don't know if something similar exists with JDBC.
Another option is to bind the LOB on client side as if it was a big RAW/VARCHAR2. This only works if a maximum size of the LOB can be defined (since the maximum size must be provided at bind time). This trick avoids the extra rountrips: the LOBs are just processed like RAW or VARCHAR2. We use it a lot in our LOB intensive applications.
Once the number of roundtrips have been optimized, the packet size (SDU) can be resized in the net configuration to better fit the situation (i.e. a limited number of large roundtrips). It tends to reduce the "SQL*Net more data to client" and "SQL*Net more data from client" wait events.
If you're "hoping to achieve similar response times to reading the varchar2 column z", then you'll be disappointed in most cases. If you're using a CLOB I suppose you need to store more than 4,000 bytes, right? Then if you need to read more bytes that's going to take longer.
BUT if you have a case where yes, you use a CLOB, but you're interested (in some instances) only in the first 4,000 bytes of the column (or less), then you have a chance of getting similar performance. It looks like Oracle can optimize the retrieval if you use something like DBMS_LOB.SUBSTR and ENABLE STORAGE IN ROW CACHE clause with your table. Example:
CREATE TABLE clobtest (x INT PRIMARY KEY, y CLOB)
LOB (y) STORE AS (ENABLE STORAGE IN ROW CACHE);
INSERT INTO clobtest VALUES (0, RPAD('a', 4000, 'a'));
UPDATE clobtest SET y = y || y || y;
INSERT INTO clobtest SELECT rownum, y FROM all_objects, clobtest WHERE rownum < 1000;
CREATE TABLE clobtest2 (x INT PRIMARY KEY, z VARCHAR2(4000));
INSERT INTO clobtest2 VALUES (0, RPAD('a', 4000, 'a'));
INSERT INTO clobtest2 SELECT rownum, z FROM all_objects, clobtest2 WHERE rownum < 1000;
COMMIT;
In my tests on 10.2.0.4 and 8K block, these two queries give very similar performance:
SELECT x, DBMS_LOB.SUBSTR(y, 4000) FROM clobtest;
SELECT x, z FROM clobtest2;
Sample from SQL*Plus (I ran the queries multiple times to remove physical IO's):
SQL> SET AUTOTRACE TRACEONLY STATISTICS
SQL> SET TIMING ON
SQL>
SQL> SELECT x, y FROM clobtest;
1000 rows selected.
Elapsed: 00:00:02.96
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
3008 consistent gets
0 physical reads
0 redo size
559241 bytes sent via SQL*Net to client
180350 bytes received via SQL*Net from client
2002 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
SQL> SELECT x, DBMS_LOB.SUBSTR(y, 4000) FROM clobtest;
1000 rows selected.
Elapsed: 00:00:00.32
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
2082 consistent gets
0 physical reads
0 redo size
18993 bytes sent via SQL*Net to client
1076 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
SQL> SELECT x, z FROM clobtest2;
1000 rows selected.
Elapsed: 00:00:00.18
Statistics
------------------------------------------------------
0 recursive calls
0 db block gets
1005 consistent gets
0 physical reads
0 redo size
18971 bytes sent via SQL*Net to client
1076 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
As you can see, consistent gets are quite higher, but SQL*Net roundtrips and bytes are nearly identical in the last two queries, and that apparently makes a big difference in execution time!
One warning though: the difference in consistent gets might become a more likely performance issue if you have large result sets, as you won't be able to keep everything in buffer cache and you'll end up with very expensive physical reads...
Good luck!
Cheers
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With