
Neo4j in distributed mode - is it possible?

Tags:

neo4j

I can't find an answer to this question anywhere on the Internet.
Is it possible to run Neo4j in distributed mode to gain better performance? Why or why not?

CypherFancy asked May 09 '18 17:05



2 Answers

It sounds like you're asking about database sharding. The short answer is no, this feature isn't supported.

Neo4j has two primary clustering modes: the older HA (high availability) clustering and the newer Causal Clustering. Both require Enterprise Edition, and in both cases every instance participating in the cluster must contain the entire graph.

For now I'll stick with causal clustering, as that's where feature development is continuing.

Read scaling is horizontal: you add read replicas to the cluster. The bolt+routing protocol ensures that explicit read transactions issued through the driver are routed either to one of the followers or to a read replica, taking load into account to some degree.

Write scaling is vertical only, because only one node at a time (the elected leader) is allowed to write. It is therefore critical that all core nodes (the nodes in the cluster that can potentially be elected leader) have adequate RAM and disk space and use SSDs.
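To make the read/write split above concrete, here is a minimal toy model in Python. It is not the Neo4j driver or its routing logic: the `Member` and `ToyRouter` names are made up for illustration, and plain round-robin stands in for whatever load-aware selection the real driver performs.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    role: str  # "LEADER", "FOLLOWER", or "READ_REPLICA"
    handled: list = field(default_factory=list)


class ToyRouter:
    """Hypothetical stand-in for a routing driver (assumption, not Neo4j code)."""

    def __init__(self, members):
        # Exactly one member is the elected leader; only it accepts writes.
        self.leader = next(m for m in members if m.role == "LEADER")
        # Followers and read replicas share the read load (round-robin here).
        readers = [m for m in members if m.role != "LEADER"]
        self._readers = itertools.cycle(readers)

    def route(self, access_mode):
        """Return the name of the member that would serve this transaction."""
        if access_mode == "WRITE":
            target = self.leader
        else:
            target = next(self._readers)
        target.handled.append(access_mode)
        return target.name


cluster = [
    Member("core-1", "LEADER"),
    Member("core-2", "FOLLOWER"),
    Member("core-3", "FOLLOWER"),
    Member("replica-1", "READ_REPLICA"),
]
router = ToyRouter(cluster)

writes = [router.route("WRITE") for _ in range(3)]
reads = [router.route("READ") for _ in range(6)]

print("writes went to:", writes)
print("reads went to: ", reads)
```

In this sketch, adding another `READ_REPLICA` member spreads reads further, while every write still lands on the single leader, which is why write capacity depends on the hardware of the core nodes.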

EDIT:

Neo4j Fabric was introduced in January 2020 with the release of Neo4j 4.0. It allows data to be sharded across multiple databases or clusters (which need no additional configuration to be used as a shard), and provides ways to query across those shards and work with the combined results.
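The general idea of querying over several shards and combining the results can be sketched in a few lines of Python. This is only a conceptual illustration with made-up data and helper names (`SHARDS`, `query_shard`, `query_all_shards`); it does not use Fabric, which is queried through Cypher rather than through code like this.

```python
# Hypothetical shards: each one holds a disjoint part of the data.
SHARDS = {
    "shard0": [{"name": "Alice", "age": 34}, {"name": "Bob", "age": 27}],
    "shard1": [{"name": "Carol", "age": 41}, {"name": "Dave", "age": 19}],
}


def query_shard(shard_name, predicate):
    """Run the same filter against a single shard."""
    return [row for row in SHARDS[shard_name] if predicate(row)]


def query_all_shards(predicate):
    """Fan the query out to every shard and concatenate the partial results."""
    combined = []
    for shard_name in sorted(SHARDS):
        combined.extend(query_shard(shard_name, predicate))
    return combined


over_25 = query_all_shards(lambda row: row["age"] > 25)
names = sorted(row["name"] for row in over_25)
print(names)
```

No single shard holds the whole data set here, which is the key difference from the clustering modes described earlier, where every instance carries the full graph.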

InverseFalcon answered Sep 30 '22 21:09



Neo4j Enterprise has clustering, but it is intended for high availability.

It does not shard data the way TigerGraph does, for example.

Each instance (node) in the cluster holds a full replica of the data set.

John Mark answered Sep 30 '22 20:09
