Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient database query for ancestors on an acyclic directed graph

Let's say I have an acyclic directed graph such as a family "tree" (not really a tree since a child has 2 parents). I want to place a representation of this graph in a relational database so that it's fast to compute all ancestors of a node, and all descendants of a node. How would you represent this graph? How would you query for all descendants? How would you insert and remove nodes and relationships? What assumptions are you making about the data?

The best solution will have the best big O for the number of select/insert/delete statements you run to query ancestors and descendants, with ties broken by best big O for total runtime, with ties broken by space requirements.

My coworker posed this question to me. I have a solution but it's exponential size in the worst case so I wanted to see how other people would solve it.

edit

Clarified relational database. This question is trivial (and boring) if you use graph databases with built in transitive closures.

like image 336
Dave Aaron Smith Avatar asked Sep 20 '10 20:09

Dave Aaron Smith


People also ask

What is a DAG SQL?

A database availability group (DAG) is a high availability (HA) and data recovery feature of Exchange Server 2010. A database availability group, which can consist of up to 16 Exchange mailbox servers, automates recovery at the database-level after a database, server or network failure.

What is DAG directed acyclic graph give an example?

A directed acyclic graph (or DAG) is a digraph that has no cycles. Example of a DAG: Theorem Every finite DAG has at least one source, and at least one sink. In fact, given any vertex v, there is a path from some source to v, and a path from v to some sink.

What is directed acyclic graph in data structure?

A directed acyclic graph (DAG) is a conceptual representation of a series of activities. The order of the activities is depicted by a graph, which is visually presented as a set of circles, each one representing an activity, some of which are connected by lines, which represent the flow from one activity to another.

How do you show acyclic on a graph?

A directed graph is acyclic if it contains no cycles. That is, starting at any node in the graph, no sequence of edges exists that can be followed to loop back to that starting node.


2 Answers

If selects > manipulations, and especially subtree selects (all ancestors, all descendants) I'd go for a Closure-table approach. Yes, an explosion of paths in your path-table, but it does deliver results fast (as opposed to the adjacency model), and keeps updates limited to relevant portions (as opposed to 50% update with nested sets).

Bill Karwin has some nice presentation online about pros and cons of different models, see http://www.slideshare.net/billkarwin/models-for-hierarchical-data (slide 48 is an overview).

like image 152
Wrikken Avatar answered Sep 28 '22 00:09

Wrikken


For DAGs in SQL databases there appeared to be only two solutions:

  1. Recursive WITH clause.

  2. Transitive closure

I'm not aware of any practical graph labeling scheme (like nested sets,intervals or materialized path)

like image 26
Tegiri Nenashi Avatar answered Sep 28 '22 01:09

Tegiri Nenashi