Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Perfomance SQL Server 2017 Graph vs Neo4j

I am researching about graph databases. I stumbled into SQL Server 2017 and learned that they added the option to use a graph database. But I have some uncertainties about the performance. I watched several Youtube videos, tutorials and papers about this SQL Server 2017 Graph. For example this page.

Image, lookup in SQL-server

With the image above in mind. When I try to find a node, is it true that the time complexity is O(n)? And is the performance in other graph databases like Neo4j similar? I am only talking about node lookup and not shortest path algorithms etc.

I also have a feeling that the graph functionality in SQL Server is just a relational database in disguise. Is this correct?

Thanks in advance.

like image 490
M1sterPl0w Avatar asked Feb 15 '19 08:02

M1sterPl0w


People also ask

Is Neo4j the best graph DB?

Neo4j has the most popular and active graph database community.

Is graph DB fast?

A graph in a graph database can be traversed along specific edge types or across the entire graph. In graph databases, traversing the joins or relationships is very fast because the relationships between nodes are not calculated at query times but are persisted in the database.

Are graph databases faster than relational databases?

Complex queries typically run faster in graph databases than they do in relational databases. Relational databases require complex joins on data tables to perform complex queries, so the process is not as fast.


1 Answers

There is a big difference between a graph database and a relational database with graph capabilities, in the sense of how the data is stored.

To summarise simply, when a triple ( aka 2 nodes connected by a relationship ) is stored, the underlying database difference will be :

  • Neo4j, the triple is stored as a graph on disk, nodes have pointers to the relationships they have, so during retrieval it will just be pointer chasing from nodes
  • SQL like : one node is stored in one table, the other node is stored in another table, yet you can query as a graph but the operation will be really making a JOIN

Based on those two facts, we can say that in native graphs the join is performed at write time compared to having joins at query time in non-native graphs.

Be very careful when you hear distributed graphs, partitions, planet scale and the like. If you start having relationships that have to be traversed over the network you will always suffer performance issues. Most of the distributed graphs platforms note also that for maximum performance you have to store everything on one partition (which defeats the partitioning purpose).

like image 162
Christophe Willemsen Avatar answered Oct 20 '22 09:10

Christophe Willemsen