
Large number of relationship types in cypher query

Tags:

neo4j

I'm prototyping a user-authorization/data-protection scheme in Neo4j, and I ran into a strange issue with one of my queries. For background, the concept is that a user trying to get from a to b can get to b only if they have the correct access identifier. So our edge types have access identifiers in them. I'm testing this scheme by creating lots of nodes and connecting pairs of them with different accesses. That is, I have lots of sets of:

(a)-[:ACCESS_A]->(b)

With different accesses. I query for them with:

{some query} WITH a MATCH (a)-[:ACCESS_A|:ACCESS_B|<...>|:ACCESS_Z]->(b) RETURN b

where the size of the list in the edge match grows with the number of accesses the user has.
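As a rough illustration of how such a query could be assembled from a user's access list, here is a minimal Python sketch. The function name `build_access_query` and the name-validation rule are assumptions for illustration, not part of my actual code, and the `{some query} WITH a` prefix is left out. Relationship types cannot be passed as Cypher parameters, which is why the type list ends up spliced into the query text.

```python
import re

# Relationship types cannot be Cypher parameters, so each name is checked
# before being spliced into the query text (hypothetical validation rule).
_TYPE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_access_query(access_types):
    """Return a Cypher query matching any of the given relationship types."""
    if not access_types:
        raise ValueError("at least one access type is required")
    for name in access_types:
        if not _TYPE_NAME.fullmatch(name):
            raise ValueError(f"unsafe relationship type: {name!r}")
    # Same `:A|:B|:C` alternation syntax as the query above.
    type_list = "|".join(f":{name}" for name in access_types)
    return f"MATCH (a)-[{type_list}]->(b) RETURN b"


print(build_access_query(["ACCESS_A", "ACCESS_B", "ACCESS_Z"]))
# prints: MATCH (a)-[:ACCESS_A|:ACCESS_B|:ACCESS_Z]->(b) RETURN b
```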

This all works great until the list gets to 201 accesses. At that point, the profile shows the db hits and time taken go WAY up. At 200 relationship types the profile shows 1051 db hits, but at 201 relationship types it shows 31801. That's a 30-fold increase for one more type! Time taken increases in a similar manner. Going from 199 to 200 types only adds about 50 hits, and that's due to an increasing number of nodes hit.

After more work, the round number 200 looks more like a red herring than the actual issue. Previously, my relationship types were 4 characters long. When I changed them to 9 characters (prepending "EDGE_", as a test), the issue began occurring at 50 types: 50 has 36 accesses, while 51 has 291. That's a smaller jump, but significant when compared to previous increases in the same test.

It appears that there is some relation between the length of the relationship type names and the point where the query falls off, but I'm still investigating.
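To put the two jumps side by side, here is a small Python sketch that finds the largest relative increase between consecutive measurements. The helper name `largest_jump` is made up for illustration, and the input pairs are only the figures quoted above (1051 and 31801 for the 4-character names, 36 and 291 for the 9-character names).

```python
def largest_jump(measurements):
    """Return (n_before, n_after, ratio) for the biggest relative increase.

    `measurements` is a list of (number_of_types, count) pairs. Returns None
    if there are fewer than two measurements. Counts are assumed non-zero.
    """
    ordered = sorted(measurements)
    best = None
    for (n0, count0), (n1, count1) in zip(ordered, ordered[1:]):
        ratio = count1 / count0
        if best is None or ratio > best[2]:
            best = (n0, n1, ratio)
    return best


four_char = [(200, 1051), (201, 31801)]  # 4-character type names
nine_char = [(50, 36), (51, 291)]        # 9-character type names

for label, data in (("4 chars", four_char), ("9 chars", nine_char)):
    n0, n1, ratio = largest_jump(data)
    print(f"{label}: {n0} -> {n1} types, x{ratio:.2f}")
# prints:
# 4 chars: 200 -> 201 types, x30.26
# 9 chars: 50 -> 51 types, x8.08
```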

Things that I've tested and not found to be of interest:

  • length of the overall query (string size): it fails at entirely different query sizes with 4- and 9-character relationship types
  • length of the list in the [e:<...>] clause (string size): as above, it fails at very different sizes
  • number of nodes or edges in the graph
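The string-size point can be checked with simple arithmetic. The sketch below assumes every type name in a test has the same length and that the list uses the `:A|:B` form from the query above; the helper name `type_list_length` is hypothetical.

```python
def type_list_length(num_types, name_length):
    """Length in characters of a `:A|:B|...` relationship type list."""
    # Each type contributes ":" plus its name; adjacent types are joined by "|".
    return num_types * (name_length + 1) + (num_types - 1)


# First failing sizes in the two tests described above.
print(type_list_length(201, 4))  # prints 1205
print(type_list_length(51, 9))   # prints 560
```

The two failing lists differ by more than a factor of two in length, which is why the raw string size does not look like the trigger.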
Tal asked Feb 02 '17 20:02


1 Answer

To my knowledge you shouldn't be running into performance issues with only 200 relationship types.

Prior to version 3.0, the number of relationship types was capped at 64k. That limit was removed with version 3.0.

InverseFalcon answered Oct 25 '22 05:10