
Storing a graph in MongoDB

I have an undirected graph where each node contains an array. Data can be added to or deleted from the array. What's the best way to store this in MongoDB and run this query efficiently: given node A, select all the data contained in the nodes adjacent to A?

In a relational DB, you can create one table representing the edges and another table storing the data in each node, like so:

table 1
NodeA, NodeB
NodeA, NodeC

table 2
NodeA, item1
NodeA, item2
NodeB, item3

You then join the tables when you query for the data in adjacent nodes. But joins are not possible in MongoDB, so what's the best way to set up this database and efficiently query for data in adjacent nodes (favoring performance slightly over space)?
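To make the intended query concrete, here is a minimal in-memory Java sketch of the two-table lookup described above. It is only an illustration of the query's semantics, not a MongoDB schema: the class name `AdjacentData` and the methods `neighbours` and `adjacentItems` are hypothetical, and the rows are the sample rows from the two tables. Because the graph is undirected, an edge is matched on either of its two columns.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class AdjacentData {
    // "table 1": each row is one undirected edge {nodeX, nodeY}.
    static final String[][] EDGES = {
        {"NodeA", "NodeB"},
        {"NodeA", "NodeC"},
    };

    // "table 2": each row is {node, item}.
    static final String[][] ITEMS = {
        {"NodeA", "item1"},
        {"NodeA", "item2"},
        {"NodeB", "item3"},
    };

    // Neighbours of a node. The graph is undirected, so the node
    // may appear in either column of an edge row.
    static Set<String> neighbours(String node) {
        Set<String> result = new LinkedHashSet<>();
        for (String[] edge : EDGES) {
            if (edge[0].equals(node)) {
                result.add(edge[1]);
            } else if (edge[1].equals(node)) {
                result.add(edge[0]);
            }
        }
        return result;
    }

    // In-memory equivalent of joining the edge table to the item table:
    // collect every item whose owning node is adjacent to the given node.
    static List<String> adjacentItems(String node) {
        Set<String> adjacent = neighbours(node);
        List<String> result = new ArrayList<>();
        for (String[] row : ITEMS) {
            if (adjacent.contains(row[0])) {
                result.add(row[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(adjacentItems("NodeA")); // prints [item3]
        System.out.println(adjacentItems("NodeB")); // prints [item1, item2]
    }
}
```

NodeC has no rows in table 2, so it contributes nothing to the result for NodeA.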

kefeizhou asked Feb 26 '11 07:02




1 Answer

Specialized Distributed Graph Databases

I know this sounds a little far afield from the OP's question about Mongo, but these days there are more specialized graph databases that excel at this kind of work and may be much easier for you to use, especially on large graphs.

There is a comparison of 7 such offerings here: https://docs.google.com/spreadsheet/ccc?key=0AlHPKx74VyC5dERyMHlLQ2lMY3dFQS1JRExYQUNhdVE#gid=0

The three most significant open source offerings (Titan, OrientDB, and Neo4j) all support the Tinkerpop Blueprints interface. So for a graph that looks like this...

[image not preserved]

... a query for "all the people that Juno greatly admires who she has known since the year 2011" would look like this:

Iterable<Vertex> results = juno.query().labels("knows").has("since", 2011).has("stars", 5).vertices();

This, of course, is just the tip of the iceberg. Pretty powerful stuff!
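For readers without a Blueprints setup, the filtering that the query above expresses can be sketched in plain Java. This is not the Blueprints API: the `Edge` class, the `query` method, and the sample edges are all made up for illustration (the original graph image is not available), and the sketch assumes that `has` means an exact match on an edge property.

```java
import java.util.ArrayList;
import java.util.List;

public class KnowsQuerySketch {
    // Hypothetical edge leaving Juno: a label, the vertex at the
    // other end, and the two properties used in the query.
    static final class Edge {
        final String label;
        final String other;
        final int since;
        final int stars;

        Edge(String label, String other, int since, int stars) {
            this.label = label;
            this.other = other;
            this.since = since;
            this.stars = stars;
        }
    }

    // Keep edges with the given label and exact property values,
    // and return the vertices at their far ends.
    static List<String> query(List<Edge> edges, String label, int since, int stars) {
        List<String> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.label.equals(label) && e.since == since && e.stars == stars) {
                result.add(e.other);
            }
        }
        return result;
    }

    // Made-up sample data; only the first edge satisfies all three conditions.
    static List<Edge> junoEdges() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("knows", "personA", 2011, 5));
        edges.add(new Edge("knows", "personB", 2011, 3));
        edges.add(new Edge("knows", "personC", 2009, 5));
        edges.add(new Edge("worksWith", "personD", 2011, 5));
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(query(junoEdges(), "knows", 2011, 5)); // prints [personA]
    }
}
```

A graph database does this kind of filtering against stored edges for you, which is the convenience the one-line query is meant to show.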

If you have to stay with Mongo

Think of Tinkerpop Blueprints as the "JDBC of storing graph structures" in various databases. The Tinkerpop Blueprints API has a specific MongoDB implementation that would work for you, I'm sure. Then, using Tinkerpop Gremlin, you have all sorts of advanced traversal and search methods at your disposal.

Jonathan Schneider answered Oct 12 '22 12:10