I have a problem in which I need to perform CRUD operations on cyclic graphs. Now I know that there are a bunch of graph databases out there, but I have a specific set of use cases which are not supported in those databases (or at least I'm not aware of them).
Following are my constructs:
Following are the functionalities I can have:
Now I understand that all of this can be done with a relational database, which will ensure that the relationships stay intact and that querying is simple. But performance will take a hit when the graphs are complex and several of them have to be updated at once.
So, I was wondering if there is a hybrid/better approach to storing, retrieving and updating these graphs that would be much faster compared to relational databases.
Any ideas would be really helpful. Thanks in advance!
I wouldn't rule out graph databases. You can easily build the missing features yourself, using extra properties/nodes/connections that serve your needs.
E.g. for creating a group, you could create a node with some prop type:Group which shares the same groupId with all the nodes belonging to that group.
Another option would be for group members to have an extra connection towards their Group: Node-belongsToGroup->GroupNode.
In either of the above solutions, connecting a Node/Group to another Group would just require creating a connection towards the Group node only.
The same goes for Definitions, e.g. Node-isOfType->DefinitionNode. Then updateDefinition would update all nodes that belong to that Definition.
Based on the above, I think it would be easy to create an API like the following:
createGroup
isGroup
addNodesToGroup
createDefinition
updateDefinition
setNodeDefinition
getNodeDefinition
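To make the idea concrete, here is a minimal in-memory sketch of such an API in Python. It is not tied to any graph database: the Graph class, its storage layout, and the snake_case method names are all assumptions made for illustration, and only the belongsToGroup and isOfType connection labels come from the text above. In a real graph database each method would translate to a node/edge query instead.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.props = {}                    # node id -> property dict
        self.edges = defaultdict(set)      # (source id, label) -> target ids

    def add_node(self, node_id, **props):
        self.props[node_id] = dict(props)
        return node_id

    def connect(self, source, label, target):
        self.edges[(source, label)].add(target)

    # Groups: a group is just a node with type "Group"; members point at it.
    def create_group(self, group_id):
        return self.add_node(group_id, type="Group")

    def is_group(self, node_id):
        return self.props.get(node_id, {}).get("type") == "Group"

    def add_nodes_to_group(self, group_id, node_ids):
        if not self.is_group(group_id):
            raise ValueError(f"{group_id!r} is not a group")
        for node_id in node_ids:
            self.connect(node_id, "belongsToGroup", group_id)

    # Definitions: nodes point at a shared definition node via "isOfType".
    def create_definition(self, definition_id, **fields):
        return self.add_node(definition_id, type="Definition", **fields)

    def update_definition(self, definition_id, **fields):
        # One write here is seen by every node that points at the definition.
        self.props[definition_id].update(fields)

    def set_node_definition(self, node_id, definition_id):
        # A node has at most one definition, so replace any earlier edge.
        self.edges[(node_id, "isOfType")] = {definition_id}

    def get_node_definition(self, node_id):
        targets = self.edges.get((node_id, "isOfType"))
        if not targets:
            return None
        return self.props[next(iter(targets))]


# Usage: two nodes in one group, sharing one definition.
g = Graph()
g.add_node("a")
g.add_node("b")
g.connect("a", "linksTo", "b")
g.connect("b", "linksTo", "a")             # cycles need no special handling
g.create_group("g1")
g.add_nodes_to_group("g1", ["a", "b"])
g.connect("a", "linksTo", "g1")            # connecting to a group = one edge
g.create_definition("d1", color="red")
g.set_node_definition("a", "d1")
g.set_node_definition("b", "d1")
g.update_definition("d1", color="blue")
```

Because the nodes only hold a connection to the definition node, update_definition does not have to touch the member nodes at all in this sketch; whether that fits depends on how often definitions are read compared to how often they change.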
As far as scalability is concerned, you could check OrientDB: Distributed-Architecture / comparison to neo4j
...only one server can be the master, so the Neo4j write throughput is limited to the capacity of the single Master server. This means that Neo4j isn’t able to scale on writes.
OrientDB, instead, supports a Multi-Master + Sharded architecture: all the servers are masters. The throughput is not limited by a single server. With OrientDB, the global throughput is the sum of the throughput of all the servers.
api ref: java api / sql ref