How to persist a graph data structure in a relational database?

Tags:

graph

I've considered creating a Vertices table and an Edges table but would building graphs in memory and traversing sub-graphs require a large number of lookups? I'd like to avoid excessive database reads. Is there any other way of persisting a graph?

Side note: I've heard of Neo4j, but my question is really how to conceptually represent a graph in a standard database. I am open to some NoSQL solutions like MongoDB, though.


asked Sep 03 '13 05:09

Frank Flannigan


2 Answers

Unfortunately, your reasoning is right on every point. To map a graph data structure onto a relational one, you store nodes (vertices) in one table and edges in another, with each edge referencing a FromNode and a ToNode. You are also right that this leads to a large number of lookups, because the graph cannot be partitioned into subgraphs that could each be fetched in one query. You have to traverse from node to edge to node to edge and so on, which is a recursive process, whereas SQL works with sets.
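As a rough illustration of that node-to-edge-to-node walk, here is a minimal Python sketch using the standard sqlite3 module. The table and column names (nodes, edges, from_node_id, to_node_id), the sample data and the function reachable_with_lookups are assumptions made up for this example; they do not come from the question.

```python
import sqlite3
from collections import deque


def build_db():
    """Create an in-memory database with a tiny, made-up sample graph."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE edges (
            id INTEGER PRIMARY KEY,
            from_node_id INTEGER REFERENCES nodes(id),
            to_node_id INTEGER REFERENCES nodes(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO nodes (id, title) VALUES (?, ?)",
        [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
    )
    conn.executemany(
        "INSERT INTO edges (from_node_id, to_node_id) VALUES (?, ?)",
        [(1, 2), (2, 3), (3, 4), (1, 3)],
    )
    return conn


def reachable_with_lookups(conn, start):
    """Breadth-first walk that issues one edge query per visited node."""
    seen = {start}
    queue = deque([start])
    lookups = 0
    while queue:
        node = queue.popleft()
        rows = conn.execute(
            "SELECT to_node_id FROM edges WHERE from_node_id = ?", (node,)
        ).fetchall()
        lookups += 1  # every visited node costs one database round trip
        for (next_node,) in rows:
            if next_node not in seen:
                seen.add(next_node)
                queue.append(next_node)
    return seen, lookups


conn = build_db()
seen, lookups = reachable_with_lookups(conn, 1)
print(sorted(seen), lookups)
```

Written this way, the number of queries grows with the number of nodes visited, which is the cost the question is worried about.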

The point is...

Relational, graph-oriented, object-oriented and document-based models are different types of data structures that meet different requirements. That is what it is all about, and it is why so many different NoSQL databases (most of them simple document stores) have appeared: it does not always make sense to organize big data in a relational way.

Alternative 1 - Graph oriented database

There are also graph-oriented NoSQL databases that make the graph data model a first-class citizen, such as OrientDB, which I am experimenting with at the moment. The nice thing about it is that, although it persists data as a graph, it can still be used in a relational, object-oriented or document-oriented way (for example, by querying with plain old SQL). Nevertheless, traversing the graph is certainly the best way to get data out of it.

Alternative 2 - working with graphs in memory

When it comes to fast routing, frameworks like GraphHopper build the complete graph (billions of nodes) in memory. Because GraphHopper uses a memory-mapped implementation of its GraphStore, this even works on Android devices with only a few MB of memory. The complete graph is read from the database into memory at startup and routing is then done there, so there is no need to query the database during routing.

answered Sep 20 '22 02:09

Jürgen Zornig

I faced the same issue and finally settled on the following structure, which requires two database queries; the rest of the work is done in memory.

Store nodes in a table, with each node record referencing the graph it belongs to:

Table Nodes

id  | title | graph_id
----------------------
105 | node1 | 2
106 | node2 | 2

Store edges in another table, again with each edge referencing the graph it belongs to:

Table Edges

id | from_node_id | to_node_id | graph_id
-----------------------------------------
1  | 105          | 106        | 2
2  | 106          | 105        | 2

Get all the nodes with one query, then get all the edges with another.

Now build your preferred in-memory representation of the graph (e.g., an adjacency list) and proceed with your application flow.
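The two-query approach above could be sketched roughly as follows, using Python's standard sqlite3 module. The table and column names follow the tables shown in this answer; the function name load_graph, the in-memory database and the extra row for graph 3 are assumptions added only to make the example self-contained.

```python
import sqlite3


def load_graph(conn, graph_id):
    """Load one graph with two queries and build an adjacency list."""
    # Query 1: every node that belongs to the graph.
    nodes = {
        node_id: title
        for node_id, title in conn.execute(
            "SELECT id, title FROM nodes WHERE graph_id = ?", (graph_id,)
        )
    }
    # Query 2: every edge that belongs to the graph.
    adjacency = {node_id: [] for node_id in nodes}
    for from_id, to_id in conn.execute(
        "SELECT from_node_id, to_node_id FROM edges "
        "WHERE graph_id = ? ORDER BY id",
        (graph_id,),
    ):
        adjacency.setdefault(from_id, []).append(to_id)
    return nodes, adjacency


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, title TEXT, graph_id INTEGER);
    CREATE TABLE edges (
        id INTEGER PRIMARY KEY,
        from_node_id INTEGER,
        to_node_id INTEGER,
        graph_id INTEGER
    );
    """
)
conn.executemany(
    "INSERT INTO nodes (id, title, graph_id) VALUES (?, ?, ?)",
    # The last row is invented, to show that other graphs are filtered out.
    [(105, "node1", 2), (106, "node2", 2), (200, "other", 3)],
)
conn.executemany(
    "INSERT INTO edges (id, from_node_id, to_node_id, graph_id) "
    "VALUES (?, ?, ?, ?)",
    [(1, 105, 106, 2), (2, 106, 105, 2)],
)

nodes, adjacency = load_graph(conn, 2)
print(nodes)
print(adjacency)
```

After these two reads, any traversal runs over the adjacency dictionary without touching the database again.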

answered Sep 21 '22 02:09

linkinu
