Specifically, a multigraph. A colleague suggested this and I'm completely baffled. Any insights?

Does it Make Sense to Map a Graph Data-structure into a Relational Database?

1 Answer

It's pretty straightforward to store a graph in a database: you have a table for nodes, and a table for edges, which acts as a many-to-many relationship table between the nodes table and itself. Like this:

create table node (
  id integer primary key
);

create table edge (
  start_id integer references node,
  end_id integer references node,
  primary key (start_id, end_id)
);

However, there are a couple of sticky points about storing a graph this way.

Firstly, the edges in this scheme are naturally directed: the start and end are distinct. If your edges are undirected, you will either have to be careful in writing queries, or store two entries in the table for each edge, one in each direction (and then be careful writing queries!). If you store a single row per edge, I would suggest normalising the stored form, perhaps by always treating the node with the lower ID as the start (and adding a check constraint to the table to enforce this). You could have a genuinely unordered representation by not having the edges refer to the nodes directly, but rather having a join table between edges and nodes; that doesn't seem like a great idea to me, though.
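
As a rough sketch of the normalised single-row approach, the following uses Python's sqlite3 module with an in-memory database; SQLite is an assumption made for the sake of a runnable example, not something the question specifies. The helper names add_undirected_edge and neighbours are hypothetical. Note that a strict start_id < end_id check also rules out self-loops; use <= if you need them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces references when asked
conn.executescript("""
create table node (
  id integer primary key
);

create table edge (
  start_id integer not null references node,
  end_id integer not null references node,
  primary key (start_id, end_id),
  check (start_id < end_id)  -- normalised form: the lower ID is always the start
);
""")


def add_undirected_edge(a, b):
    """Store the edge {a, b} once, in normalised order (hypothetical helper)."""
    conn.execute(
        "insert into edge (start_id, end_id) values (?, ?)",
        (min(a, b), max(a, b)),
    )


def neighbours(n):
    """Neighbours of n: the row may list n at either end, so look at both."""
    rows = conn.execute(
        """
        select end_id from edge where start_id = ?
        union
        select start_id from edge where end_id = ?
        order by 1
        """,
        (n, n),
    )
    return [r[0] for r in rows]


conn.executemany("insert into node (id) values (?)", [(1,), (2,), (3,)])
add_undirected_edge(2, 1)  # stored as (1, 2)
add_undirected_edge(3, 2)  # stored as (2, 3)
print(neighbours(2))
```

The "be careful writing queries" part shows up in neighbours: even with a normalised form, every lookup has to consider both columns.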

Secondly, the schema above has no way to represent a multigraph, because the primary key allows only one row per pair of nodes. You can extend it easily enough to do so. If edges between a given pair of nodes are indistinguishable, the simplest thing would be to add a count to each edge row, saying how many edges there are between the referred-to nodes. If they are distinguishable, then you will need to add something to the edge table to allow them to be distinguished; an autogenerated edge ID might be the simplest thing, and it would replace (start_id, end_id) as the primary key, since several rows may then share the same pair of nodes.
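
The two multigraph variants might look something like this. It is only a sketch, again assuming SQLite through Python's sqlite3 module; the table names edge_counted and edge_identified and the label column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table node (
  id integer primary key
);

-- Indistinguishable parallel edges: one row per node pair, plus a count.
create table edge_counted (
  start_id integer not null references node,
  end_id integer not null references node,
  edge_count integer not null default 1 check (edge_count > 0),
  primary key (start_id, end_id)
);

-- Distinguishable parallel edges: each edge gets its own autogenerated ID,
-- so the node pair no longer has to be unique.
create table edge_identified (
  id integer primary key,
  start_id integer not null references node,
  end_id integer not null references node,
  label text  -- hypothetical attribute that tells parallel edges apart
);
""")

conn.executemany("insert into node (id) values (?)", [(1,), (2,)])

# Two indistinguishable edges from 1 to 2: a single row with a count of 2.
conn.execute(
    "insert into edge_counted (start_id, end_id, edge_count) values (1, 2, 2)"
)

# Two distinguishable edges from 1 to 2: two rows with different IDs.
conn.executemany(
    "insert into edge_identified (start_id, end_id, label) values (?, ?, ?)",
    [(1, 2, "road"), (1, 2, "rail")],
)

counted = conn.execute(
    "select sum(edge_count) from edge_counted where start_id = 1 and end_id = 2"
).fetchone()[0]
identified = conn.execute(
    "select count(*) from edge_identified where start_id = 1 and end_id = 2"
).fetchone()[0]
print(counted, identified)
```

Both queries report two edges between nodes 1 and 2; the difference is whether each edge can carry its own attributes.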

However, even having sorted out the storage, you have the problem of working with the graph. If you want to do all of your processing on objects in memory, and the database is purely for storage, then no problem. But if you want to run queries on the graph in the database, then you'll have to figure out how to do them in SQL, which doesn't have any inbuilt support for graphs, and whose basic operations aren't easily adapted to work with graphs. It can be done, especially if your database supports recursive queries via recursive common table expressions (PostgreSQL, Firebird, and some of the proprietary databases do), but it takes some thought. If you want to do this, my suggestion would be to post further questions about the specific queries.
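
To give a flavour of what a recursive query looks like, here is a minimal reachability sketch over the directed schema above. It assumes SQLite (which also supports "with recursive") driven from Python's sqlite3 module; the function name reachable_from and the sample graph are invented for the example, and other databases may need small syntax changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table node (
  id integer primary key
);

create table edge (
  start_id integer references node,
  end_id integer references node,
  primary key (start_id, end_id)
);
""")

conn.executemany("insert into node (id) values (?)", [(i,) for i in range(1, 6)])
# A small directed graph with a cycle: 1 -> 2 -> 3 -> 1, plus 3 -> 4.
# Node 5 has no edges at all.
conn.executemany(
    "insert into edge (start_id, end_id) values (?, ?)",
    [(1, 2), (2, 3), (3, 1), (3, 4)],
)


def reachable_from(start):
    """IDs of all nodes reachable from start, including start itself."""
    rows = conn.execute(
        """
        with recursive reachable(id) as (
          select ?
          union  -- union (not union all) drops repeats, so cycles terminate
          select edge.end_id
          from edge join reachable on edge.start_id = reachable.id
        )
        select id from reachable order by id
        """,
        (start,),
    )
    return [r[0] for r in rows]


print(reachable_from(1))
```

Shortest paths, cycle detection, and similar queries follow the same pattern but need extra bookkeeping columns in the recursive part, which is where the "takes some thought" comes in.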

158

answered Sep 20 '22 05:09

Tom Anderson

Tags:

computer-science

database

graph-theory

rrb_bbr
