Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Graphs and version control

I have a directed graph data structure, where I am trying to implement individual version control for each vertex. This creates some interesting scenarios, and I would much appreciate any ideas that you guys have. Specifically, I am looking to resolve the default behavior of the system when encountering the stated scenarios.

See the following image: Graph versions

Scenario 1: "The Null Pointer Paradox"

Vertex A is rolled back to version 1.0. Since this rollback will cascade down its subgraph, C will no longer be pointing to D. This could create a hazard. Should the behavior be to:

  • 1.1: Delete the edge C -> D, creating a broken graph
  • 1.2: Delete D, leaving E orphaned
  • 1.3: Delete D and E
  • 1.4: Refuse to perform the rollback before all edges pointing to D (which would be E -> D in this case) are deleted
  • 1.X: Alternative solutions?

Scenario 2: "Indirect Effects"

Vertex D is updated, so that the following holds:

  • D is now version 1.2
  • E is now version 1.1
  • C is now version 1.3
  • A is now version 1.3

Vertex A is now rolled back to version 1.2, so that the following holds:

  • A is now version 1.2
  • C is now version 1.2
  • D is now version 1.1

Should the default behavior be to:

  • 2.1: Roll back E to 1.0
  • 2.2: Refuse to roll back due to version hazard, in effect impairing functionality
  • 2.X: Alternative solutions?
like image 823
awdz9nld Avatar asked Oct 30 '10 20:10

awdz9nld


3 Answers

What you are handling is a very complex problem, and although I am not aware of focalized research projects specifically on this regard, I heard some attempts to manage the issue. it's basically impossible to pull out a quick and dirty solution. I can point you to previous questions I asked on this regard (graphs and version control):

  • triplestore with revisions
  • Graph databases and RDF triplestores: storage of graph data in python

What I ended up doing was refusing to perform any kind of revision control, but instead created new nodes with new identities every time. I lose rollback, I lose any kind of tracking, but I kept the thing manageable.

In practice, the only thing you could do is to have revision control for the graph as a whole. It's very hard to have it for individual nodes. All the issues you describe arise from the fact that you are crossing-through transaction layers, where in each layer you have a consistent graph as formed in a specific moment in time. If you cross-through these layers, allowing vertex revision control, you are dealing with a can of worms, Dune sized.

like image 108
Stefano Borini Avatar answered Nov 05 '22 04:11

Stefano Borini


It seems to me that there is some confusion about the granularity here. If you only version individual vertices but not the graph, then rolling back an individual vertex should not affect the rest of the graph. If, OTOH, you want the whole graph to be rolled back, then you should also version the whole graph.

The problem is that if you only version individual vertices, then you only have integrity guarantees for individual vertices, but not for the graph as a whole. So, if, as you describe it, rolling back an individual vertex "ripples through" the whole graph (or at least the connected subgraph), then you are not guaranteed to end up in a consistent state.

The research that seems to be closest to what you are trying, is about version control for XML, which, however, only deals with strongly typed trees (IOW degenerate graphs), not general graphs.

like image 32
Jörg W Mittag Avatar answered Nov 05 '22 04:11

Jörg W Mittag


Antiquity is a versioned graph implementation for Blueprints (thus it works with neo4j too), Take a look at: https://github.com/asaf/antiquity

like image 41
asaf000 Avatar answered Nov 05 '22 06:11

asaf000