Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Howto map a tree structure in Neo4j?

Tags:

tree

neo4j

cypher

I am working on describing a tree structured data set in Neo4j. In my current model, a node can have n links to other nodes, making it the child node of those nodes:

root
 |
 \-- A
     |
     \-- 1
     |
     \-- 2
  • A links to root
  • 1 and 2 link to A
  • root is parent of A
  • 1 and 2 are children of A

Because I am using nodejs (with node-neo4j), reading of the database is restricted to using Cypher. To read the nodes I use the following query:

START n=node(1)
    -- this is the root node

MATCH (n)<-[linkedby:links*]-x, x-[linksto?:links*1]->()
    -- get all nodes that link to the root node and all nodes linking to those nodes, etc, etc and retrieve all nodes those found nodes link to

RETURN n, x, linkedby, LAST(linkedby) AS parent, COLLECT(DISTINCT linksto) AS links      
    -- the last in the path of linkedby is the direct parent

ORDER BY length(linkedby), parent.rank?
    -- ordering so that parents are fetched before their children from the result set

My problem: this query becomes very slow (>1 s), as the number of nodes and relationships rise.

Is there a better way of modeling the nodes and relationships? Do I need different relationships between parent nodes and child nodes? Or should I change the query in any way?

Thanks for any pointers!


1) The real world problem: A kind of business process tool, where services are linked to processes, organisations, team, etc to give information which services are required when and by whom and also giving information who will provide this service or is responsible.

So for instance:

Service S is used in Process P1 and P2:

P1  P2
|   |
\---+-- S

Service S is managed by Team T:

T
|
\-- S

Team T is part of Organisation O:

O
|
\-- T

Tree:

root
 |
 |
 +-- Processes
 |   |
 |   +-- P1
 |   |   |
 |   |   \-- S
 |   |
 |   \-- P2
 |   |   |
 |   |   \-- S
 |
 +-- Organisations
 |   |
 |   +-- O
 |       |
 |       \-- T
 |           |
 |           \-- S

2) My data in console.neo4j.org:

CREATE 
  (r {name:'root'}), 
  (ps {name: 'Processes'})-[:links]->r, 
    (p1 {name: 'P1'})-[:links]->ps,
    (p2 {name: 'P2'})-[:links]->ps,
      (s {name: 'Service'})-[:links]->p1,
      s-[:links]->p2,
  (org {name: 'Organisations' })-[:links]->r,
    (t {name: 'Team'})-[:links]->org,
      s-[:links]->t
like image 888
Sonata Avatar asked Sep 18 '13 09:09

Sonata


1 Answers

So, I had a talk with Michael Hunger the other day and he acknowledged that Cypher isn't good (yet) at processing recursive queries. This should change in 2.1, but until then he proposed I write an unmanaged extension, which I could then call from nodejs.

This completely solved my problem: Retrieving the tree using plain Java is about 10x faster then using Cypher.

Simplified code:

public class EntitiesStream {
    public Entity load(long nodeId) {
        Node n = database.getNodeById(nodeId);
        Entity entity = Entity.from(n);
        loadChildren(n, entity);
        return entity;
    }

    private void loadChildren(Node n, Entity e) {
        for (Relationship linksFrom: n.getRelationships(Direction.INCOMING, Relationships.links)) {
            Node startNode = linksFrom.getStartNode();
            Entity childEntity = Entity.from(startNode);
            e.addChild(((Number) linksFrom.getProperty("rank")).longValue, childEntity);
            this.loadChildren(startNode, childEntity);
        }
    }
}

public class Entity {
    private final TreeMap<Long, Entity> sortedChildren;

    Entity(long dbId) {
        this.sortedChildren = new TreeMap<>();
        // ...
    }

    public static Entity from(Node node) {
        Entity e = new Entity(node.getId());
        // ...

        return e;
    }

    public void addChild(Long rank, Entity childEntity) {
        sortedChildren.put(rank, childEntity);
    }
}
like image 191
Sonata Avatar answered Sep 21 '22 18:09

Sonata