 

Can you partition a graph database? If so, how?

I know that databases in general can scale horizontally using master/slave replication. This is a great strategy when the number of concurrent reads is growing.

As the number of concurrent writes or just the amount of data starts to grow, though, master/slave replication doesn't help, since every write still goes through the master and every replica holds the full data set. At that point you need to partition your data instead.

This works great for key-value scenarios. A classic example to me is TinyURL/bit.ly; reading/writing the data for short URL foo can be totally independent of reading/writing data for short URL bar.
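To make the key-value case concrete, here is a minimal sketch of hash-based partitioning, assuming an in-memory list of dictionaries stands in for separate database servers. The names `NUM_SHARDS`, `shard_for`, `put`, and `get` are hypothetical and not part of any particular product.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of partitions

# Each dict stands in for an independent database server (an assumption).
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def put(short: str, url: str) -> None:
    """Store a short URL on the one shard that owns its key."""
    shards[shard_for(short)][short] = url

def get(short: str):
    """Look up a short URL; only the owning shard is consulted."""
    return shards[shard_for(short)].get(short)

put("foo", "https://example.com/foo")
put("bar", "https://example.com/bar")
```

Each read or write touches exactly one shard, which is why this scheme works well when keys are independent of each other.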

But what are you supposed to do if you're in a graph scenario? More concretely, is it possible to partition a graph database like Neo4j at all? If so, how?

I can't wrap my head around how you could possibly break up a graph without defeating the purpose of using a graph database (efficient traversals).

Aseem Kishore asked Mar 17 '11 18:03



1 Answer

You rarely traverse an entire graph structure.

Further, graph structures are rarely densely connected across all of their nodes.

With a little care, you can locate clusters of well-connected nodes that are separated from other clusters by only a small number of connections.

http://en.wikipedia.org/wiki/Cluster_analysis

If you partition based on clustering, then traversal within a cluster may be faster, but traversal to another cluster will be slower, because it has to cross a partition boundary.

The overall benefit of partitioning depends on the ratio of in-cluster traversals to between-cluster traversals.
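As a rough illustration of that ratio, the sketch below runs a breadth-first traversal over a toy graph and counts how many hops stay inside a partition and how many cross to another one. The graph, the partition assignment, and the function name `traversal_hops` are made up for the example; it is not how Neo4j or any other graph database measures this.

```python
from collections import deque

def traversal_hops(adjacency, partition_of, start):
    """Breadth-first traversal from `start`.

    Returns (local, remote): the number of newly discovered nodes reached
    over an edge inside one partition vs. over an edge between partitions.
    """
    seen = {start}
    queue = deque([start])
    local = remote = 0
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            queue.append(neighbor)
            if partition_of[node] == partition_of[neighbor]:
                local += 1
            else:
                remote += 1
    return local, remote

# Toy graph (an assumption): two triangles joined by a single edge c-d.
adjacency = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
# Each triangle is placed on its own partition.
partition_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

local, remote = traversal_hops(adjacency, partition_of, "a")
```

With this clustering, a full traversal from `a` makes four local hops and one remote hop. Assigning the same nodes to partitions at random would typically push more hops into the remote column.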

S.Lott answered Sep 30 '22 16:09
