Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Modeling an ordered tree with neo4j

I'm just getting started with neo4j, and I understand the principles of the graph and relationships, but I'm having a little bit of trouble with certain structures I want to model. I wanted to use it on a programming language project, and store the AST of a parsed source file. From there, I plan on adding a lot of additional data and relationships to the nodes to help with analysis and tooling, but the fundamental AST is still a little difficult.

The naive way of making a tree would be to simply walk the AST and copy every node in the tree to a node in neo4j, using properties to keep track of token data, etc. and then using a CHILD relationship to point to the child nodes. The problem is that when I later want to traverse the tree, I need to be able to do it in the correct order of the original AST, but out of the box I'm not quite sure the best way to do that.

I have two basic approaches I'm thinking of off the top of my head. One is to just add an index/ordinal property to each CHILD relationship. The other is to have a FIRST relationship to the first child and a NEXT relationship between each child to maintain order that way.

For either of these approaches it still doesn't seem like there's anything out of the box I can use to traverse this in the correct order. I think if I do FIRST/NEXT, I can get the correct order as long as I force neo4j to always traverse FIRST first and do a depth first search. Would that work? Is there a better way? This seems like something that should be handled easier out of the box.

UPDATE

Ultimately I decided to use both of my ideas. Child nodes have a CHILD relationship with an index property. The first child also has FIRST_CHILD relationship. Sibling nodes have a NEXT_SIBLING relationship to give the correct ordering. After that, the traversal was easy:

//reusable traversal description
final private TraversalDescription AST_TRAVERSAL = Traversal.description()
    .depthFirst()
    .expand(new OrderedByTypeExpander()
        .add(RelType.FIRST_CHILD, Direction.OUTGOING)
        .add(RelType.NEXT_SIBLING, Direction.OUTGOING));

and then when I actually needed to walk the tree I could just do

for(Path path : AST_TRAVERSAL.traverse(astRoot)){
    //do stuff here
}

For my use case, I don't actually modify the tree structure itself after creation - I just perform analysis and add more relationships and properties, so this is easy to maintain. If I had to do more modification, it might be a little bit of work, especially if I want to maintain the index numbers on child relations. So that might be something to consider for someone else in a similar situation.

If I did get into something more mutable, I would likely try out the collections Peter Neubauer suggested, and would probably just create a OrderedTreeNode class pointing to a node and using the List collection for children.

like image 867
Russell Leggett Avatar asked Jan 31 '12 14:01

Russell Leggett


People also ask

What are the disadvantages of Neo4j?

Neo4j has some upper bound limit for the graph size and can support tens of billions of nodes, properties, and relationships in a single graph. No security is provided at the data level and there is no data encryption. Security auditing is not available in Neo4j.

Is Neo4j faster than MySQL?

For the simple friends of friends query, Neo4j is 60% faster than MySQL. For friends of friends of friends, Neo is 180 times faster. And for the depth four query, Neo4j is 1,135 times faster.

What elements of a Neo4j graph are used to categorize entities?

Neo4j Property Graph ModelNodes must have labels to categorize entities. Relationships must have direction and type. Nodes and relationships can have properties. A label is used to categorize a set of nodes.

What is Graphdb Neo4j?

Neo4j is a graph database. A graph database, instead of having rows and columns has nodes edges and properties. It is more suitable for certain big data and analytics applications than row and column databases or free-form JSON document databases for many use cases. A graph database is used to represent relationships.


2 Answers

Russel, we are working on things like that, have a ordered time tree in the works that is structured much along the lines of different YEAR-2012->MONTH-01->DAY-21->VALUE123 and will probably have NEXT relationships between e.g. MONTHs of the same year.

Otherwise, if you do this, consider contributing it or investigating the stuff in https://github.com/neo4j/graph-collections, contribution and testing there is highly appreciated!

like image 73
Peter Neubauer Avatar answered Oct 04 '22 14:10

Peter Neubauer


For the benefit of anyone finding this more than 2 years later, there finally is a library that supports time trees out of the box (disclaimer: I'm one of the authors):

https://github.com/graphaware/neo4j-timetree

like image 32
Michal Bachman Avatar answered Oct 04 '22 15:10

Michal Bachman