 

Is H2 database suitable as embedded database with large tables?

Our application currently uses H2 as an embedded database and we have the following scenario:

  • H2 is used as a "temporary database". Every 30 minutes, an application task sends the data inserted into H2 to an Oracle database (the "official" one) and then deletes it from H2 (see the sketch after this list);

  • This main "temporary table" receives an average of 183 inserted rows per hour, all in a single table.

  • We have two other large tables (21 million and 1.5 million records, respectively) that the main application task uses only for querying. Another application task incrementally updates these tables from Oracle, applying in H2 the rows that were created/updated/deleted in Oracle since the last synchronization. This also runs every 30 minutes.
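
Roughly, the flush task mentioned above looks like the following sketch (plain JDBC, with placeholder table and column names rather than our real schema):

import java.sql.*;

// Simplified sketch of the 30-minute flush task (placeholder names, not the real schema).
public class TemporaryTableFlushTask {

    // Copy rows from the H2 "temporary table" into Oracle, then delete them from H2.
    public void flush(Connection h2, Connection oracle) throws SQLException {
        try (Statement select = h2.createStatement();
             ResultSet rows = select.executeQuery("SELECT id, payload FROM temp_events");
             PreparedStatement insert = oracle.prepareStatement(
                     "INSERT INTO official_events (id, payload) VALUES (?, ?)");
             PreparedStatement delete = h2.prepareStatement(
                     "DELETE FROM temp_events WHERE id = ?")) {
            while (rows.next()) {
                long id = rows.getLong("id");
                insert.setLong(1, id);
                insert.setString(2, rows.getString("payload"));
                insert.executeUpdate();
                delete.setLong(1, id);
                delete.executeUpdate();
            }
        }
    }
}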

We have been using H2 for 1.5 years so far with no problems, but we've found the following warning about H2 in Red Hat's official documentation:

However, it should not be used in a production environment. It is a very small, self-contained datasource that supports all of the standards needed for testing and building applications, but is not robust or scalable enough for production use.

Is H2 designed for, and reliable enough for, production use in a scenario like this one?

Are there any benchmarks that support this? The official H2 performance benchmark shows execution times and resource usage, but says nothing about data volume.

Joaquim Oliveira asked Apr 25 '16 at 18:04

1 Answer

Is the embedded H2 database in-memory or persistent? My primary concern would be failover. If your JVM goes down, you'll lose any data since the last Oracle sync point.
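
For reference, which mode you get is determined by the JDBC URL you connect with; a minimal sketch, with example database names and paths (not taken from your setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class H2ConnectionModes {
    public static void main(String[] args) throws SQLException {
        // In-memory: all data lives on the heap and disappears when the JVM exits.
        Connection inMemory = DriverManager.getConnection("jdbc:h2:mem:tempdb;DB_CLOSE_DELAY=-1");

        // Persistent (file-based): data is written to a file under ./data and survives a restart,
        // so a crash loses only uncommitted work, not everything since the last Oracle sync.
        Connection persistent = DriverManager.getConnection("jdbc:h2:./data/tempdb");

        inMemory.close();
        persistent.close();
    }
}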

Beyond that, the embedded H2 database can only be used within the same JVM as your application. Therefore, you would not be able to scale out to a high-availability architecture with multiple JVMs: each JVM would have its own H2 database, and you wouldn't be able to share that data across JVMs.

Finally, there's the issue of heap. If you're using an in-memory database, your heap will grow as the volume of data grows, and eventually you may run out of RAM or cause thrashing as the garbage collector tries to keep you from running out of heap.
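
If heap growth is the concern, one mitigation (assuming you can run file-based rather than purely in-memory) is to cap H2's page cache with the CACHE_SIZE setting so heap usage stops tracking data volume; the path and cache size below are only illustrative:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class BoundedCacheExample {
    public static void main(String[] args) throws SQLException {
        // File-based database with the page cache capped at about 64 MB (CACHE_SIZE is given in KB),
        // so the heap footprint stays roughly constant even as the tables grow on disk.
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./data/tempdb;CACHE_SIZE=65536")) {
            // ... run the usual queries and inserts here ...
        }
    }
}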

Other limits may be found here: http://www.h2database.com/html/advanced.html#limits_limitations

Dean Clark answered Sep 19 '22 at 23:09