We are trying to optimize our dataserver application. It stores stocks and quotes over a mysql database. And we are not satisfied with the fetching performances.
- database
- table stock : around 500 lines
- table quote : 3 000 000 to 10 000 000 lines
- one-to-many association : one stock owns n quotes
- fetching around 1000 quotes per request
- there is an index on (stockId,date) in the quote table
- no cache, because in production, querys are always different
- Hibernate 3
- mysql 5.5
- Java 6
- JDBC mysql Connector 5.1.13
- c3p0 pooling
This fills up our stock object with 857 quotes object (everything correctly mapped in hibernate.xml)
session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class).
add(Restrictions.eq("stockId", stockId)).
setFetchMode("quotes", FetchMode.JOIN).uniqueResult();
SQL generated :
SELECT this_.stockId AS stockId1_1_,
this_.symbol AS symbol1_1_,
this_.name AS name1_1_,
quotes2_.stockId AS stockId1_3_,
quotes2_.quoteId AS quoteId3_,
quotes2_.quoteId AS quoteId0_0_,
quotes2_.value AS value0_0_,
quotes2_.stockId AS stockId0_0_,
quotes2_.volume AS volume0_0_,
quotes2_.quality AS quality0_0_,
quotes2_.date AS date0_0_,
quotes2_.createdDate AS createdD7_0_0_,
quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC
Results :
Thinking to increase performance, we've used that code that fetch only the quotes objects and we manually add them to a stock (so we don't fetch repeated infos about the stock for every line). We used createSQLQuery to minimize effects of aliases and HQL mess.
String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());
SQL generated :
SELECT *
FROM quote q
WHERE stockId='AAPL'
AND q.date>1322910573000
ORDER BY q.date ASC
Results :
String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();
Results :
Your help is very welcome.
Both Hibernate & JDBC facilitate accessing relational tables with Java code. Hibernate is a more efficient & object-oriented approach for accessing a database. However, it is a bit slower performance-wise in comparison to JDBC.
The short answer on the fundamental difference between JDBC and Hibernate is that Hibernate performs an object-relational mapping framework, while JDBC is simply a database connectivity API. The long answer requires a history lesson on database access with Java.
Hibernate support inheritance and association mapping. This is the major benefit of Hibernate over JDBC. In JDBC there is no concept of inheritance or association mapping. In association mapping, it is easy to save and manage the entity. For example, when we save parent entity the child entity will save automatically.
Can you do a smoke test with the simples query possible like:
SELECT current_timestamp()
or
SELECT 1 + 1
This will tell you what is the actual JDBC driver overhead. Also it is not clear whether both tests are performed from the same machine.
Is there a way to optimize the performance of JDBC driver ?
Run the same query several thousand times in Java. JVM needs some time to warm-up (class-loading, JIT). Also I assume SimpleJDBC.getConnection()
uses C3P0 connection pooling - the cost of establishing a connection is pretty high so first few execution could be slow.
Also prefer named queries to ad-hoc querying or criteria query.
And will Hibernate benefit this optimization ?
Hibernate is a very complex framework. As you can see it consumes 75% of the overall execution time compared to raw JDBC. If you need raw ORM (no lazy-loading, dirty checking, advanced caching), consider mybatis. Or maybe even JdbcTemplate
with RowMapper
abstraction.
Is there a way to optimize Hibernate performance when converting result sets ?
Not really. Check out the Chapter 19. Improving performance in Hibernate documentation. There is a lot of reflection happening out there + class generation. Once again, Hibernate might not be a best solution when you want to squeeze every millisecond from your database.
However it is a good choice when you want to increase the overall user experience due to extensive caching support. Check out the performance doc again. It mostly talks about caching. There is a first level cache, second level cache, query cache... This is the place where Hibernate might actually outperform simple JDBC - it can cache a lot in a ways you could not even imagine. On the other hand - poor cache configuration would lead to even slower setup.
Check out: Caching with Hibernate + Spring - some Questions!
Are we facing something not tunable because of Java fundamental object and memory management ?
JVM (especially in server configuration) is quite fast. Object creation on the heap is as fast as on the stack in e.g. C, garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower compared to more native connection. That's why I suggested few improvements in your benchmark.
Are we missing a point, are we stupid and all of this is vain ?
I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With