Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

H2 performance recommendations

Tags:

java

database

h2

I'm currently working with a somewhat larger database, and though I have no specific issues, I would like some recommendations, if anyone has any.

The database is 2.2 gigabyte (after recreation/compacting). It contains about 50 tables. One of those tables contains a blob plus some metadata. It currently has about 22000 rows. If I remove the blobs from the table (UPDATE table SET blob = null), the database size is reduced to about 200 megabyte (after recreation/compacting). The metadata is accessed a lot, the blobs however are not that often needed.

The database URL I currently use is:

jdbc:h2:D:/data;AUTO_SERVER=true;MVCC=true;CACHE_SIZE=524288

It runs in our Java VM which has 4GB max heap.

Some things I was wondering:

  • Would running H2 in a separate process have any impact on performance (for better or for worse)?
  • Would it help to have the blobs in a separate table with a 1-1 relation to the metadata? I could imagine it would help with the caching, not having the blobs in the way?
  • The internet seems divided on whether to include blobs in a database or write them to files on a filesystem with a link in the DB. Any H2-specific advise here?
like image 280
Brecht Yperman Avatar asked Jun 05 '15 07:06

Brecht Yperman


People also ask

Is H2 DB good for production?

Mainly, H2 database can be configured to run as inmemory database, which means that data will not persist on the disk. Because of embedded database it is not used for production development, but mostly used for development and testing.

Is H2 database fast?

Features of H2 DatabaseIt is an extremely fast database engine. H2 is open source and written in Java. It supports standard SQL and JDBC API. It can use PostgreSQL ODBC driver too.

Is H2 faster than MySQL?

Comparing the normalized speed of Hibernate with MySQL database server (0.35) to the normalized speed of Hibernate with H2 database server (6.1) reveals that in that case, Hibernate with H2 server is 17.4 times faster than Hibernate with MySQL server.


1 Answers

The answer for you depends on the growth rate of your blob data. If for example, your data set is going to grow at 10% per week - then there is little point of trying to extend the use of H2 to store blob data (as it will quickly out pace the available heap memory). If instead the blob data is the biggest it will ever be, then attempting to use H2 might make sense.

To answer your questions about H2:

1) Running H2 in a separate process will allow H2 claim the majority of heap space - making controlling the available heap space for H2 much more manageable. However, you'll also be adding the maintenance overhead of having a separate process to maintain and monitor. So the answer is "it depends on your operating environment and goals". If you have the people and time, running H2 in a separate process might make sense. But if that's true - then you should probably consider just running an appropriate blob storage platform instead.

2) Yes, you're correct that storing the blobs in a separate table would help with caching - in the case that you don't often need the blobs. It should also help with retrieval times, as H2 won't have to read past the blobs to find the metadata.

3) Note that "the internet" represents many thousands of people with almost as many different specific use cases. You'll need to filter down your use case into requirements, and then apply the logic you glean from others.

4) My personal advice is, if you're trying to make a scalable and maintainable platform - use the right tools. H2, or any other relational database, is most often not the right tool for storing many large blobs. I'd recommend that you investigate using a key/value store.

like image 200
Travis Collins Avatar answered Oct 02 '22 03:10

Travis Collins