 

How does Datomic compare to Neo4j?

I am looking at integrating Neo4j into a Clojure system I am building. The first question I was asked was why I didn't use Datomic. Does anyone have a good answer for this? I have heard of Datomic and seen videos on it, but I don't know enough about graph databases to know the difference between Neo4j and Datomic, or what difference it would make to me.

asked Jul 27 '13 06:07 by yazz.com




1 Answer

There are a few fundamental differences between them:

Data Model

Both Neo4j and Datomic can model arbitrary relationships. Both effectively use an EAV (entity-attribute-value) schema, so they can model many of the same problem domains. The difference is that Datomic's schema also embeds a time dimension (EAVT), which makes it very powerful if you want to run efficient queries against your database as of arbitrary points in time. This is something mutable data stores (Neo4j included) simply cannot do.
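To make the EAVT idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Datomic's API: the `Datom` tuple, the `as_of` function and the sample facts are all made up for illustration, and it assumes each attribute holds a single value per entity.

```python
from collections import namedtuple

# One fact: entity, attribute, value, transaction time, and whether the
# fact was asserted (True) or retracted (False). Hypothetical toy model.
Datom = namedtuple("Datom", ["e", "a", "v", "t", "added"])

def as_of(log, t):
    """Return {(entity, attribute): value} as it stood at transaction t."""
    view = {}
    for d in sorted(log, key=lambda d: d.t):
        if d.t > t:
            break
        key = (d.e, d.a)
        if d.added:
            view[key] = d.v
        elif view.get(key) == d.v:
            # Only retract the value that is actually current.
            del view[key]
    return view

# Facts are only ever appended; nothing is updated in place.
log = [
    Datom("alice", "works-at", "Acme", 1, True),
    Datom("alice", "works-at", "Acme", 2, False),
    Datom("alice", "works-at", "Initech", 2, True),
]

then = as_of(log, 1)  # the database as of transaction 1
now = as_of(log, 2)   # the database as of transaction 2
```

Because the log is append-only, the view as of transaction 1 can still be computed after later transactions have been recorded.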

Data Access

Both Neo4j and Datomic provide traversal APIs and query languages:

Queries

Both Neo4j and Datomic provide declarative query languages (Cypher and Datalog, respectively) that support recursive queries. Datomic's Datalog, however, provides far superior querying capabilities because custom filtering and aggregate functions can be implemented as arbitrary JVM code. In practice, this means Cypher's built-in functions can effectively be superseded by Clojure's sequence library. This is possible because your application, not the database, is the one running queries.
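The point about queries running inside the application can be sketched as follows. This is a toy illustration in plain Python rather than Datalog, and the `query` helper and sample facts are hypothetical. Since the query runs in the application's own process, any ordinary function can serve as the filter or the aggregate.

```python
import statistics

# Facts as (entity, attribute, value) triples. Illustrative data only.
facts = [
    ("p1", "age", 31),
    ("p2", "age", 45),
    ("p3", "age", 19),
    ("p1", "name", "Ann"),
]

def query(facts, attribute, predicate, aggregate):
    """Collect values of one attribute, filter them, then aggregate them."""
    values = [v for (_e, a, v) in facts if a == attribute and predicate(v)]
    return aggregate(values)

# Both the filter and the aggregate are plain application functions.
median_adult_age = query(facts, "age", lambda age: age >= 21, statistics.median)
```

Swapping `statistics.median` for any other function that takes a list requires no support from the database.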

Traversal

Traversal APIs are always driven by application code, which means both Neo4j and Datomic can walk a graph using arbitrary traversal, filtering and data transformation code. The difference is that Neo4j requires a running transaction, which in practice means the traversal is time-bounded.
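As a rough illustration of an application-driven traversal, the sketch below walks an in-memory adjacency map breadth-first and applies a caller-supplied filter. It uses neither Neo4j's nor Datomic's traversal API. The `traverse` function and the sample graph are assumptions made for the example.

```python
from collections import deque

def traverse(graph, start, keep=lambda node: True):
    """Breadth-first walk from start; keep decides which nodes are reported."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if keep(node):
            order.append(node)
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order

# Adjacency map: node -> list of nodes it points to. Illustrative data only.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

everything = traverse(graph, "a")
without_c = traverse(graph, "a", keep=lambda node: node != "c")
```

The filter only affects what is reported. Filtered-out nodes are still expanded, so `d` remains reachable through `c`.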

Data Consistency

Another fundamental difference is that Datomic queries don't require database coordination (i.e. there are no read transactions) and they always work with a consistent data snapshot. This means you can perform multiple queries and data transformations over an arbitrary period of time with the guarantee that your results will always be consistent and that no transaction will time out (because there is none). Again, this is impossible in mutable data stores, which is the vast majority of existing databases (Neo4j included). This also applies to their traversal APIs.
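The snapshot behaviour described above can be modelled in a few lines. This is a toy sketch, not Datomic's peer library: the `Connection` class and its `db` and `transact` methods are hypothetical stand-ins, and a read-only mapping plays the role of an immutable database value.

```python
from types import MappingProxyType

class Connection:
    """Hypothetical connection that hands out immutable snapshots."""

    def __init__(self):
        self._current = MappingProxyType({})

    def db(self):
        # Reading needs no coordination: return the current immutable value.
        return self._current

    def transact(self, changes):
        # Writing builds a new value; existing snapshots are left untouched.
        merged = dict(self._current)
        merged.update(changes)
        self._current = MappingProxyType(merged)

conn = Connection()
conn.transact({"balance": 100})

snapshot = conn.db()             # taken before the next write
first_read = snapshot["balance"]
conn.transact({"balance": 40})   # a concurrent writer changes the data
second_read = snapshot["balance"]
latest = conn.db()["balance"]
```

Both reads against `snapshot` agree no matter how much time passes between them, while a fresh call to `conn.db()` sees the newer value.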

Both Neo4j and Datomic are transactional (ACID) systems, but because Neo4j uses traditional interactive transactions (using optimistic concurrency controls), queries need to happen inside transactions (they need to be coordinated), which imposes timeout constraints on your queries. In practice, this means that for very complex, long-running queries you'll end up splitting them so that each part finishes within certain time limits, giving up data consistency across the parts.

Working Set

If for some reason your queries need to involve a huge amount of data (more than would normally fit in memory) and you can't stream the results (Datomic does provide streaming APIs), Datomic is probably not a good fit. You wouldn't be taking advantage of Datomic's architecture: peers would be forced to constantly evict their working memory, make additional network calls and decompress data segments.

answered Oct 24 '22 16:10 by a2ndrade