Oracle JDBC prefetch: how to avoid running out of RAM / how to make Oracle faster over high-latency connections

Using Oracle's Java JDBC driver (ojdbc14 10.2.x), loading a query with many rows takes forever in a high-latency environment. Apparently this is because the default prefetch size in Oracle JDBC is 10 rows, which costs one round trip per 10 rows (for example, a million-row query at 10 rows per round trip over a 50 ms link would spend over 80 minutes on latency alone). I am attempting to set an aggressive prefetch size to avoid this:

 PreparedStatement stmt = conn.prepareStatement("select * from tablename");
 stmt.setFetchSize(10000);
 ResultSet rs = stmt.executeQuery();

This can work, but instead I get an out-of-memory exception. I had presumed that setFetchSize would tell the driver to buffer "that many rows" as they come in, using only as much RAM as each row actually requires. If I run with 50 threads, even with 16 GB of -Xmx heap space, it runs out of memory. It almost feels like a leak:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.reflect.Array.newArray(Native Method)
    at java.lang.reflect.Array.newInstance(Array.java:70)
    at oracle.jdbc.driver.BufferCache.get(BufferCache.java:226)
    at oracle.jdbc.driver.PhysicalConnection.getCharBuffer(PhysicalConnection.java:7422)
    at oracle.jdbc.driver.OracleStatement.prepareAccessors(OracleStatement.java:983)
    at oracle.jdbc.driver.T4CTTIdcb.receiveCommon(T4CTTIdcb.java:273)
    at oracle.jdbc.driver.T4CTTIdcb.receive(T4CTTIdcb.java:144)
    at oracle.jdbc.driver.T4C8Oall.readDCB(T4C8Oall.java:771)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:346)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:186)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:521)
    at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:205)
    at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:861)
    at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1145)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1267)
    at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3449)
    at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3493)
    at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1491)
    ....

What can I do to still get prefetch but not run out of RAM? What is going on?

The closest related item on SO is this: https://stackoverflow.com/a/14317881/32453

asked Jan 27 '15 by rogerdpack
1 Answer

Basically, Oracle's default strategy for recent ojdbc jars is to preallocate, for every prefetched row, an array sized to the largest value that could conceivably come back from that query. So in my case the query included some VARCHAR2(4000) columns, and 50 threads (Statements) * 3 VARCHAR2 columns * 4000 chars each was adding up to gigabytes of RAM with a setFetchSize of only a few hundred [yikes]. There does not appear to be an option to say "don't preallocate that array, just allocate sizes as the rows come in." Ojdbc even keeps these preallocated buffers around between PreparedStatements (cached per connection) so it can reuse them. Definitely a memory hog.
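
To see how fast that adds up, here is a back-of-envelope estimate using the numbers from the question above (the per-row accounting is a simplification of the behavior just described, not the driver's exact bookkeeping):

    // Rough estimate of the char buffers the driver preallocates:
    // one slot per prefetched row, sized to each column's declared maximum.
    public class PrefetchMemoryEstimate {
        public static void main(String[] args) {
            int threads = 50;            // concurrent Statements, as in the question
            int fetchSize = 10000;       // the question's setFetchSize value
            int varchar2Columns = 3;     // columns declared VARCHAR2(4000)
            int declaredMaxChars = 4000; // declared width, regardless of actual data
            long bytesPerChar = 2;       // Java chars are UTF-16

            long bytes = (long) threads * fetchSize * varchar2Columns
                       * declaredMaxChars * bytesPerChar;
            System.out.printf("~%.1f GB preallocated%n", bytes / 1e9); // ~12.0 GB
        }
    }

By this estimate the declared column widths alone account for roughly 12 GB across 50 threads, which is consistent with exhausting a 16 GB heap.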

One workaround: set setFetchSize to some sane amount. The default is 10, which can be quite slow over high-latency connections. Profile, and only use as high a setFetchSize as actually produces significant speed improvements; see the sketch below.
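
A rough harness for that profiling step might look like this (a sketch only: the JDBC URL, credentials, table name, and candidate sizes are placeholders to adapt to your environment, and the try-with-resources syntax assumes a modern JDK rather than the ojdbc14-era ones):

    import java.sql.*;

    // Times a full scan of the same query at several fetch sizes so you can
    // pick the smallest size that still gives a significant improvement.
    public class FetchSizeProfiler {
        public static void main(String[] args) throws SQLException {
            int[] candidates = {10, 100, 400, 1000, 4000};
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@//dbhost:1521/SERVICE", "user", "password")) {
                for (int fetchSize : candidates) {
                    long start = System.nanoTime();
                    try (PreparedStatement stmt =
                             conn.prepareStatement("select * from tablename")) {
                        stmt.setFetchSize(fetchSize);
                        try (ResultSet rs = stmt.executeQuery()) {
                            int rows = 0;
                            while (rs.next()) rows++;
                            long ms = (System.nanoTime() - start) / 1_000_000;
                            System.out.printf("fetchSize=%d: %d rows in %d ms%n",
                                    fetchSize, rows, ms);
                        }
                    }
                }
            }
        }
    }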

Another workaround is to determine the maximum actual column size, then cap the column in the query so the driver only allocates for that width, e.g. (assuming 50 is the known maximum actual size) select substr(column_name, 0, 50), as sketched below.
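
In JDBC terms that might look like the following (a fragment assuming an open Connection conn, in the style of the question's code; the table name, column name, and 50-char cap are illustrative):

    // Truncate in SQL so the driver sizes its per-row buffer to 50 chars
    // instead of the declared VARCHAR2(4000).
    PreparedStatement stmt = conn.prepareStatement(
        "select substr(column_name, 1, 50) as column_name from tablename");
    stmt.setFetchSize(1000); // now affordable: each row buffers at most 50 chars
    ResultSet rs = stmt.executeQuery();

(In Oracle SQL a substr starting position of 0 is treated the same as 1.)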

Other things you can do: decrease the number of prefetch rows, increase the java -Xmx parameter, and select only the columns you actually need.

Once we were able to use a prefetch size of at least 400 on all queries (make sure to profile to see what numbers are good for you; with high latency we saw improvements up to prefetch sizes of 3-4K), performance improved dramatically.

I suppose that if you wanted to be really aggressive against sparse "really long" rows, you might be able to re-query when you run into these [rare] large rows.

Details ad nauseam here

answered Nov 04 '22 by rogerdpack