
Recommendations for an in memory database vs thread safe data structures

TLDR: What are the pros/cons of using an in-memory database vs locks and concurrent data structures?

I am currently working on an application that has many (possibly remote) displays that collect live data from multiple data sources and render it on screen in real time. One of the other developers has suggested using an in-memory database instead of doing it the standard way our other systems do, which is to use concurrent hashmaps, queues, arrays, and other objects to store the graphical objects, guarding them with locks where necessary. His argument is that the DB will lessen the need to worry about concurrency, since it will handle read/write locks automatically, and that it will also offer an easier way to structure the data into as many tables as we need, instead of having to create hashmaps of hashmaps of lists, etc., and keep track of it all.

I do not have much DB experience myself, so I am asking fellow SO users: what experiences have you had, and what are the pros and cons of inserting a DB into the system?

z - asked Mar 25 '10 17:03



1 Answer

Well, a major con would be the mismatch between Java's object model and a DB's data model. That's a big headache if you don't need it. It would also be a lot slower for really simple access. On the other hand, the benefits would be transactions and persistence to the file system in case of a crash. Also, depending on your needs, it allows for querying in a way that might be difficult to do with a regular Java data structure.
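To make the transaction point a bit more concrete, here is a rough sketch of what you end up writing by hand when two structures have to change together. The class and method names (`ShapeRegistry`, `assign`, `countOn`) are made up for illustration and are not from the question. Note that a lock like this only gives you isolation between threads; it does not give you rollback or persistence, which a DB transaction would.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical registry that keeps two maps in step:
// shape id -> display id, and display id -> number of shapes on it.
class ShapeRegistry {
    private final Map<String, String> displayOfShape = new HashMap<>();
    private final Map<String, Integer> countPerDisplay = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Assigns a shape to a display. Both maps are updated under the write
    // lock, so readers never observe one map updated without the other.
    void assign(String shapeId, String displayId) {
        lock.writeLock().lock();
        try {
            String previous = displayOfShape.put(shapeId, displayId);
            if (previous != null) {
                countPerDisplay.merge(previous, -1, Integer::sum);
            }
            countPerDisplay.merge(displayId, 1, Integer::sum);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reads take the shared read lock, so many readers can run in parallel.
    int countOn(String displayId) {
        lock.readLock().lock();
        try {
            return countPerDisplay.getOrDefault(displayId, 0);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Every additional structure that has to stay consistent with these two must go under the same lock, which is the bookkeeping a DB would otherwise do for you.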

For something in between, I would take a look at Neo4j. It is a pure Java graph database. This means that it is easily embeddable, handles concurrency and transactions, scales well, and does not have all of the mismatch problems that relational DBs have.

Update: If your data structure is simple enough (a map of lists, a map of maps, something like that), you can probably get away with the concurrent collections in the JDK or Google Collections. Go much beyond that, though, and you will likely find yourself recreating an in-memory database. And if your query constraints are even remotely difficult, you're going to have to implement all of those facilities yourself. And then you'll have to make sure that they work concurrently, etc. If this requires any serious complexity or scale (large datasets), I would definitely not roll your own unless you really want to commit to it.
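For the simple case, a map of lists built from the JDK's concurrent collections might look roughly like the sketch below. `DisplayStore`, `Shape`, and the field names are hypothetical, and `CopyOnWriteArrayList` assumes reads far outnumber writes; with heavy write traffic you would want a different list. The `fromSource` method is the kind of hand-written "query" you would have to keep adding as requirements grow.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Collectors;

// Hypothetical store: display id -> shapes currently shown on that display.
class DisplayStore {
    // Hypothetical immutable value type for one graphical object.
    static final class Shape {
        final String source;
        final double x;
        final double y;

        Shape(String source, double x, double y) {
            this.source = source;
            this.x = x;
            this.y = y;
        }
    }

    private final Map<String, List<Shape>> shapesByDisplay = new ConcurrentHashMap<>();

    // computeIfAbsent creates the per-display list atomically on first use.
    void add(String displayId, Shape shape) {
        shapesByDisplay
                .computeIfAbsent(displayId, id -> new CopyOnWriteArrayList<>())
                .add(shape);
    }

    // Returns the live list for a display, or an empty list if none exists.
    List<Shape> shapesOn(String displayId) {
        return shapesByDisplay.getOrDefault(displayId, Collections.emptyList());
    }

    // Hand-rolled query: every shape from one data source, across all displays.
    // This is a full scan; there is no index unless you build one yourself.
    List<Shape> fromSource(String source) {
        return shapesByDisplay.values().stream()
                .flatMap(List::stream)
                .filter(shape -> shape.source.equals(source))
                .collect(Collectors.toList());
    }
}
```

Each individual operation here is thread safe, but the scan in `fromSource` is only weakly consistent: it may or may not see shapes added while it runs. If you need a consistent view across displays, you are back to explicit locking.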

If you do decide to go with an embedded DB, there are quite a few choices. You might want to start by considering whether you want to go the SQL or the NoSQL route. Unless you see real benefits in going the SQL route, I think it would greatly add to the complexity of your app. Hibernate is probably your easiest route with the least actual SQL, but it's still kind of a headache. I've done it with Derby without serious issues, but it's still not straightforward. You could try db4o, which is an object database that can be embedded and doesn't require mapping. This is a good overview. Like I said before, if it were me, I would likely try Neo4j, but that could just be me wanting to play with new and shiny things ;) I just see it as being a very transparent library that makes sense. Hibernate/SQL and db4o just seem like too much hand waving to feel lightweight.

Russell Leggett answered Oct 22 '22 20:10
