Combining a Graph Database and an RDBMS

Is it bad design, or difficult to implement, to use two or more databases for one application?

For instance, let's say I have User objects which I would like to store in a relational database. These User objects have relationships with one another and have user feeds (think of Twitter / Facebook), and I want to store these relationships so I can find friends of a friend, see how "deep" I am in a chain of feeds, etc. These relationships would be stored in the graph database.

Is there any better way to go about this or would using a graph database for relationships and a relational database for data storage be the best solution?

Jake Miller asked Dec 25 '22 03:12



1 Answer

Disclaimer: I haven't used the enterprise version of neo4j yet, so it may have capabilities to help you here that I am not aware of.

If you can keep everything in neo4j, that's best: it keeps complexity low on several fronts, including data modeling, keeping data in sync, and keeping queries simple and atomic instead of splitting them between separate databases.

It would help to know what your requirements are for using an RDBMS, and whether those requirements justify the added complexity described above.

If you are determined to do this, then you have a choice between high data redundancy and a more skeletal neo4j db that keeps only IDs, relationships, and minimal data.

With high data redundancy, mirroring most if not all data in neo4j, you've got the added complexity of keeping everything in sync and consistent between your dbs (not trivial at all). This buys you richer queries through neo4j, since most data is in the same db, and a greater ability to drop your rdbms and go with just neo4j in the future. But any query that alters both dbs will not be automatically atomic. You'll have to enforce that in your server-side code, and that's likely to be tricky.
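To give a feel for what that server-side enforcement might look like, here is a minimal sketch. It is not a real neo4j integration: `FakeGraph`, `GraphWriteError`, the `friendships` table, and `add_friendship` are all made-up names for illustration, with an in-memory SQLite database standing in for the rdbms. The idea is to hold the relational transaction open until the graph write has succeeded, and roll it back otherwise.

```python
import sqlite3


class GraphWriteError(Exception):
    """Raised by the stand-in graph client when a write fails."""


class FakeGraph:
    """Hypothetical stand-in for a graph database client (not a neo4j API)."""

    def __init__(self, fail=False):
        self.edges = set()
        self.fail = fail

    def add_friendship(self, a, b):
        if self.fail:
            raise GraphWriteError("graph store unavailable")
        self.edges.add((a, b))


def add_friendship(conn, graph, a, b):
    """Write to both stores; undo the relational write if the graph write fails."""
    try:
        # sqlite3 opens a transaction implicitly before the INSERT.
        conn.execute("INSERT INTO friendships (a, b) VALUES (?, ?)", (a, b))
        graph.add_friendship(a, b)  # may raise GraphWriteError
        conn.commit()               # only commit once both writes went through
        return True
    except GraphWriteError:
        conn.rollback()             # discard the uncommitted relational write
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friendships (a INTEGER, b INTEGER)")
conn.commit()

ok = add_friendship(conn, FakeGraph(), 1, 2)
failed = add_friendship(conn, FakeGraph(fail=True), 3, 4)
rows = conn.execute("SELECT a, b FROM friendships ORDER BY a").fetchall()
```

Note that this still isn't truly atomic: if the final `commit()` fails after the graph write has gone through, the two stores disagree and you need a compensating delete on the graph side or some reconciliation job. That gap is the "tricky" part.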

With a skeletal approach, most of your neo4j queries will take IDs as input and return IDs as output. Rich data may not be available in the graph, so you'll take those IDs and select from your rdbms by ID for the data you need. Any query that involves relationships and expects data back will require both dbs, which can be troublesome for developers, though probably fine in your server-side code. You'll avoid issues of synchronization, and probably atomicity as well, since the data shared between the two dbs will be minimal.
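As a rough sketch of that ID-in, ID-out flow (under assumptions, not actual neo4j code): the `FRIENDS` dict below stands in for the graph db, an in-memory SQLite `users` table stands in for the rdbms, and the function names and sample data are invented for the example. The graph side answers "who is two hops away?" with IDs only, and the relational side turns those IDs into rows.

```python
import sqlite3

# Stand-in for the graph database: user ID -> set of friend IDs.
# In a real setup this would be a graph query that returns IDs only.
FRIENDS = {
    1: {2, 3},
    2: {1, 4},
    3: {1, 4, 5},
    4: {2, 3},
    5: {3},
}


def friends_of_friends(user_id):
    """Return IDs two hops from user_id, excluding direct friends and self."""
    direct = FRIENDS.get(user_id, set())
    two_hops = set()
    for friend in direct:
        two_hops |= FRIENDS.get(friend, set())
    return two_hops - direct - {user_id}


def load_users(conn, ids):
    """Fetch the rich user rows for a set of IDs from the relational store."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT id, name FROM users WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, sorted(ids)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cat"), (4, "Dee"), (5, "Eve")],
)
conn.commit()

fof_ids = friends_of_friends(1)       # step 1: relationships, IDs only
fof_rows = load_users(conn, fof_ids)  # step 2: rich data by ID
```

The only data the two stores share here is the user ID, which is why there's little to keep in sync. The cost is that every "relationships plus data" question becomes two round trips.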

It's worth noting that there are some solutions for integrating with other databases in a similar pattern to what you've proposed.

GraphAware has a concept of graph-aided search, which offers integration between ElasticSearch and neo4j, though this is primarily aimed at requirements for rich searchability. It can be used either to feed ElasticSearch for use as a pure search engine, or to let you store all your data in ES and boost or otherwise affect results based on relational data in neo4j.

InverseFalcon answered Dec 30 '22 22:12
