
In Neo4j, what level of specificity should be used when granularity level can be unlimited?

The hardest thing to wrap my head around when using a graph database is choosing the level of granularity. Let's say I have a graph for things that occur on certain days of the week: trash day, taco tuesday, BYOB friday, etc.

  • I can make each day its own node (Mon, Tue, Wed, ...), so that querying for a specific day is fast.
  • I can make a node called Day and add a property name holding the day of the week, so that showing all days in the graph is easy to query for.
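A minimal sketch of the two options, using plain Python data structures as a stand-in for the graph. This is not Neo4j code; the structure names, the helper functions, and the assignment of trash day to Monday are all illustrative assumptions.

```python
# Option 1: one node per day. The day's identity is the node itself,
# so a lookup goes straight to that node.
# (Assumption: trash day falls on Monday; the question does not say.)
events_by_day_node = {
    "Mon": ["trash day"],
    "Tue": ["taco tuesday"],
    "Fri": ["BYOB friday"],
}

# Option 2: generic "Day" nodes that carry the day in a `name` property.
day_nodes = [
    {"label": "Day", "name": "Mon", "events": ["trash day"]},
    {"label": "Day", "name": "Tue", "events": ["taco tuesday"]},
    {"label": "Day", "name": "Fri", "events": ["BYOB friday"]},
]


def events_on_specific(day):
    """Option 1 lookup: address the node for that day directly."""
    return events_by_day_node.get(day, [])


def events_on_generic(day):
    """Option 2 lookup: filter Day nodes on the `name` property."""
    return [
        event
        for node in day_nodes
        if node["label"] == "Day" and node["name"] == day
        for event in node["events"]
    ]


def all_days_generic():
    """Option 2 makes 'list every day' a single filter on the label."""
    return [node["name"] for node in day_nodes if node["label"] == "Day"]
```

Both lookups return the same events for a given day; the difference is which question each structure answers most directly.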

My own thinking is that making nodes very specific is bad because there is no limit to granularity: for example, separate nodes for Saturday morning, evening and night, or worse, a new node for each hour of each day. I could also make edges part of the granularity, by saying that the Saturday node is linked to the trash day node by an "evening" edge.
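The edge-based variant can be sketched the same way, with the part of the day stored on the relationship instead of on a node. Again this is plain Python rather than Neo4j, and the tuple layout, the `events_for` helper, and the "farmers market" entry are hypothetical.

```python
# Each edge is (day node, relationship type, event node).
# The part of the day lives on the edge, so no extra nodes are needed.
edges = [
    ("Sat", "evening", "trash day"),
    ("Sat", "morning", "farmers market"),  # hypothetical extra event
]


def events_for(day, part=None):
    """Return events for a day, optionally narrowed to a part of the day."""
    return [
        event
        for d, rel, event in edges
        if d == day and (part is None or rel == part)
    ]
```

Calling `events_for("Sat")` ignores the finer granularity, while `events_for("Sat", "evening")` uses it, so the day nodes stay coarse and only the edges grow more specific.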

I come across similar problems every now and then. For example, should I create a new node based on a person's full name, or a node called "Person" with a property "name"? So far I make nodes either specific or general based on convenience, but I feel there may be some best practice or higher-level principle I'm missing. It's not clear to me how to judge which way is better.

ForeverConfused asked Jan 27 '18 08:01





1 Answer

The level of granularity of your data model should be driven by your query requirements, not the other way around. That is, when modeling your database, ask yourself: "What kinds of queries will I run over my data?" The answers to this question give you a good starting point for a model with an appropriate level of granularity.

In the book Learning Neo4j, Rik Van Bruggen says the following about designing graph databases for query-ability:

Like with any database management system, but perhaps even more so for a graph database management system such as Neo4j, your queries will drive your model. What we mean with this is that, exactly like it was with any type of database that you may have used in the past or would still be using today, you will need to make specific design decisions based on specific trade-offs. Therefore, it follows that there is no one perfect way to model in a graph database such as Neo4j. It will all depend on the questions that you want to ask of the data, and this will drive your design and your model.

So, based on this, the answer to your question "what level of specificity should be used when granularity level can be unlimited?" is: it depends on your query requirements. Think first about the queries you will run, and only then about the data model.

My suggestion is: keep your model as simple as possible in the beginning and, when required, make gradual changes.
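As a rough illustration of working from queries to model, the sketch below writes down two hypothetical query requirements first and then picks the simplest structure that answers both. It is plain Python, not Neo4j, and the requirements, the `schedule` list, and the function names are assumptions made for the example.

```python
# Hypothetical query requirements, written down before any modeling:
#   Q1: which events happen on a given day?
#   Q2: which days have at least one event?

# Simplest model that answers both: (day, event) pairs.
# No part-of-day or hourly detail, because no query asks for it yet.
schedule = [
    ("Tue", "taco tuesday"),
    ("Fri", "BYOB friday"),
]


def events_on(day):
    """Q1: events scheduled on the given day."""
    return [event for d, event in schedule if d == day]


def days_with_events():
    """Q2: distinct days that have an event, in first-seen order."""
    seen = []
    for d, _ in schedule:
        if d not in seen:
            seen.append(d)
    return seen
```

If a later requirement asks what happens on Saturday evening, that is the point at which to add a part-of-day field or relationship, not before.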

Bruno Peres answered Oct 13 '22 09:10
