
Pure Java alternative to database / cache for storing records

I have created an application that is sold to customers, some of whom are hardware manufacturers with fixed constraints (slow CPU). The app has to be in Java so that it can be easily installed as a single package.

The application is multithreaded and maintains audio records. In this particular case, all we do is INSERT SOMEDATA FOR RECORD, with each record representing an audio file (this can be done by different threads). Later on, a single thread runs SELECT SOMEDATA WHERE IDS in (x, y, z). As a third step, we DELETE all the data in this table.

The main constraint is CPU: a slow, single CPU. Memory is also a constraint, but only in that the application is designed to process an unlimited number of files, so even with lots of memory it would eventually run out if everything were stored in memory rather than on disk.

In my Java application I started off using the H2 database to store all my data. But the software has to run on some slow single-CPU servers, so I want to reduce the CPU cycles used, and one area I want to look at again is the database.

In many cases I insert data into the database simply to keep it off the heap, since otherwise I would run out of memory. Later on we retrieve the data; we never have to UPDATE it.

So I considered using a cache like ehCache but that has two problems:

  • It doesn't guarantee that data will not be thrown away (if the cache gets full).
  • I can only retrieve records one at a time, whereas with a relational database I can retrieve a batch of records. This looks like a potential bottleneck.

What is an alternative that solves these issues?

Paul Taylor asked Jun 29 '26 01:06



1 Answer

You want to retrieve records in batches quickly and not lose any data, but you need neither optimized queries nor updates, and you want to use CPU and memory as efficiently as possible:

Why don't you simply store your records in a file? The operating system uses any free memory for caching. So when you access your file frequently, the OS will do its best to keep as much of its content as possible in memory. The OS does this job anyway, so this type of caching costs you no additional CPU and not a single line of code.
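To make the idea concrete, here is a minimal sketch of such a file, assuming each record can be serialized to a byte array. The class `RecordFile` and its method names are hypothetical, chosen only for illustration. Records are appended with a length prefix, `append` is synchronized so that several worker threads can call it, and `close` deletes the file, which stands in for the final DELETE step.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

/** Hypothetical helper: an append-only file of length-prefixed records. */
final class RecordFile implements AutoCloseable {
    private final Path path;
    private final DataOutputStream out;

    RecordFile(Path path) throws IOException {
        this.path = path;
        this.out = new DataOutputStream(new BufferedOutputStream(
                Files.newOutputStream(path, StandardOpenOption.CREATE, StandardOpenOption.APPEND)));
    }

    /** Appends one record; synchronized so that several threads may insert. */
    synchronized void append(byte[] record) throws IOException {
        out.writeInt(record.length); // 4-byte length prefix
        out.write(record);
    }

    /** Flushes pending writes, then streams every record to the consumer in insertion order. */
    synchronized void forEach(Consumer<byte[]> consumer) throws IOException {
        out.flush();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(path)))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // no more records
                }
                byte[] record = new byte[length];
                in.readFully(record);
                consumer.accept(record); // only one record is held on the heap at a time
            }
        }
    }

    /** Closes the stream and deletes the file (the equivalent of the final DELETE). */
    @Override
    public synchronized void close() throws IOException {
        out.close();
        Files.deleteIfExists(path);
    }
}
```

Because records are handed to the consumer one at a time, heap usage stays bounded no matter how large the file grows. Whether this is actually cheaper than H2 on the target hardware is something to measure.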

The only scenarios where it could make sense to invest more in optimization would be:

  • a) Your process or other processes make heavy use of the file system and pollute the file cache
  • b) Serialization / deserialization is too expensive

In case of a):

Define your priorities. An explicit cache (on-heap or off-heap) can help you keep some content of selected files in memory. But this memory will no longer be available to the OS's file cache. So while you speed up access to one file, you potentially slow down access to other files.
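One possible shape for such an explicit on-heap cache is a small LRU map placed in front of the file, sketched below using the standard `LinkedHashMap` access-order mode. The class name `LruRecordCache` and the entry-count limit are assumptions for illustration. Evicted entries are not lost, because the file on disk remains the source of truth; a miss just means reading the record from the file again.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical on-heap LRU cache for record bytes, keyed by record ID.
 * Not thread-safe; wrap it with Collections.synchronizedMap if several threads share it.
 */
final class LruRecordCache extends LinkedHashMap<Long, byte[]> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruRecordCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order (least recently used first)
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each put; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        return size() > maxEntries;
    }
}
```

The limit here counts entries rather than bytes, which is only a reasonable bound if records are of similar size.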

In case of b):

Measure performance first, before you optimize anything. Usually disk access is the bottleneck, and that's something you cannot change without replacing hardware. If you still want to optimize (e.g. because GC eats up CPU due to a very high number of temporarily created objects; I guess with only one core the serial GC will be in use), then I suggest taking a closer look at Google FlatBuffers.

You started with the most complex solution for your problem, a database. I suggest starting at the other end of the spectrum and keeping it as simple as possible.


UPDATE: The question has been edited in the meantime and the requirements have changed. A new requirement is that it must be possible to read selected records by ID.

Possible extensions:

  • Store each record in its own file and use the key as the file name.
  • Store all records in one file and use a file-based HashMap implementation such as MapDB's HTreeMap.

Regardless of which extension you choose, the operating system's file cache will do its best to hold as much content as possible in main memory.
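The first extension (one file per record, with the key as the file name) could look roughly like the sketch below. The class `RecordDirectory` and its method names are hypothetical, and it assumes that IDs are numeric and that each ID is written once, by a single thread. Under that assumption inserts from different threads touch different files and need no locking.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: stores each record in its own file, named after the record ID. */
final class RecordDirectory {
    private final Path dir;

    RecordDirectory(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
    }

    /** INSERT: writes (or overwrites) the file for this ID. */
    void insert(long id, byte[] data) throws IOException {
        Files.write(dir.resolve(Long.toString(id)), data);
    }

    /** SELECT ... WHERE ID IN (...): returns the records that exist, in the requested order. */
    Map<Long, byte[]> select(Collection<Long> ids) throws IOException {
        Map<Long, byte[]> result = new LinkedHashMap<>();
        for (long id : ids) {
            Path file = dir.resolve(Long.toString(id));
            if (Files.exists(file)) {
                result.put(id, Files.readAllBytes(file));
            }
        }
        return result;
    }

    /** DELETE: removes every record file in the directory. */
    void deleteAll() throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                Files.delete(file);
            }
        }
    }
}
```

With a very large number of records, a single directory can become slow on some file systems; spreading the files over subdirectories, or using the single-file option instead, may then be the better choice.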

rmunge answered Jul 01 '26 13:07
