Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle CLOB performance

I am running queries against an Oracle 10g with JDBC (using the latest drivers and UCP as DataSource) in order to retrieve CLOBs (avg. 20k characters). However the performance seems to be pretty bad: the batch retrieval of 100 LOBs takes 4s in average. The operation is also neither I/O nor CPU nor network bound judging from my observations.

My test setup looks like this:

PoolDataSource dataSource = PoolDataSourceFactory.getPoolDataSource();
dataSource.setConnectionFactoryClassName("...");
dataSource.setConnectionPoolName("...");
dataSource.setURL("...");
dataSource.setUser("...");
dataSource.setPassword("...");

dataSource.setConnectionProperty("defaultRowPrefetch", "1000");
dataSource.setConnectionProperty("defaultLobPrefetchSize", "500000");

final LobHandler handler = new OracleLobHandler();
JdbcTemplate j = new JdbcTemplate(dataSource);

j.query("SELECT bigClob FROM ...",

        new RowCallbackHandler() {

            public void processRow(final ResultSet rs) throws SQLException {

                String result = handler.getClobAsString(rs, "bigClob");

            }

        });

}

I experimented with the fetch sizes but to no avail. Am I doing something wrong? Is there a way to speed up CLOB retrieval when using JDBC?

like image 486
yawn Avatar asked Oct 06 '09 13:10

yawn


3 Answers

The total size of the result set is in the ten thousands - measured over the span of the whole retrieval the initial costs

Is there an Order By in the query? 10K rows is quite a lot if it has to be sorted.

Also, retrieving the PK is not a fair test versus retrieving the entire CLOB. Oracle stores the table rows with probably many in a block, but each of the CLOBs (if they are > 4K) will be stored out of line, each in a series of blocks. Scanning the list of PK's is therefore going to be fast. Also, there is probably an index on the PK, so Oracle can just quickly scan the index blocks and not even access the table.

4 seconds does seem a little high, but it is 2MB that needs to be possible read from disk and transported over the network to your Java program. Network could be an issue. If you perform an SQL trace of the session it will point you at exactly where the time is being spent (disk reads or network).

like image 146
Stephen ODonnell Avatar answered Nov 13 '22 21:11

Stephen ODonnell


My past experience of using oracle LOB type data to store large data has not been good. It is fine when it is under 4k since it store it locally like varchar2. Once it is over 4k, you start seeing performance degrade. Perhaps, things may have improved since I last tried it a couple of years ago, but here are the things I found in the past for your information:

As clients need to get LOBs via oracle server, you may consider the following interesting situation.

  • lob data will compete limited SGA cache with other data type if oracle decide to cache it. As clob data are general big, so it may push other data
  • lob data get poor disk read if oracle decide not to cache it, and stream the data to the client.
  • fragmentation is probably something that you haven't encountered yet. You will see if your applications delete lobs, and oracle tries to reuse the lob. I don't know if oracle support online defragmenting the disk for lob (they have for indexes, but it takes long time when we tried it previous).

You mentioned 4s for 100 lobs of avg 20k, so it's 40ms per lobs. Remember each lob needs to have to retrieved via separate Lob locater (it is not in the result set by default). That is an additional round trip for each lob, I assume (I am not 100% sure on this since it was a while ago) If that is the case, I assume that will be at least 5ms extra time per round trip in serial order, right? If so, your performance is already first limited by sequential lob fetches. You should be able to verify this by tracking the time spent in sql execution vs lob content fetching. Or you can verify this by excluding the lob column as suggested by the previous answer in the post, which should tell you if it is lob related.

Good luck

like image 6
Oscar Chan Avatar answered Nov 13 '22 22:11

Oscar Chan


I had a similar issue and found the JDBC Lobs making a network call when accessin the lobs.

As of Oracle 11.2g JDBC Driver you can use a prefetch. This speeded up access by 10 times...

statement1.setFetchSize(1000);
if (statement1 instanceof OracleStatement) {
    ((OracleStatement) statement1).setLobPrefetchSize(250000);
}
like image 6
Robin Morris Avatar answered Nov 13 '22 22:11

Robin Morris