
What are the pitfalls for using ElasticSearch as a nosql db for a social application vs a graph database?

Our company has several products and several teams. One team is in charge of search, and is standardizing on Elasticsearch as a NoSQL db to store all their data, with plans to use Neo4j later to complement their searches with relationship data.

My team is responsible for the product side of a social app (people have friends, work for companies, are colleagues with everyone working at their companies, etc). We're looking at graph dbs as a solution (after abandoning the burning ship that is n^2 relationships in an RDBMS), specifically Neo4j (the Cypher query language is a beautiful thing).

A subset of our data is similar to the data used by the search team, and we will need to make sure search can query their data and our data simultaneously. The search team is pushing us to standardize on Elasticsearch for our db instead of Neo4j or any graph db. I believe this is for the sake of standardization and consistency.

We're obviously coming from very different places here: search concerns vs product concerns. The search team asserts that Elasticsearch can cover all our use cases, including graph-like queries to find suggestions. While that's probably true, I'd really like to stick with Neo4j and use an Elasticsearch plugin to integrate with their search.

In this situation, are there any major gotchas to choosing Elasticsearch over Neo4j for a product db (or vice versa)? Any guidelines or anecdotes from those who have been in similar situations?

InverseFalcon asked Jun 30 '16 23:06



1 Answer

We are heavy users of both technologies, and in our experience you are better off using each for what it is good at.

Elasticsearch is a very good piece of software when it comes to search functionality, log management and facets.

Despite their graph plugin, if you want to model a lot of social-network-like relationships in Elasticsearch indices, you will have two problems:

  1. You will have to update documents every time a relationship changes, which can add up to a lot of writes when a single entity changes. For example, say you have organizations whose users make contributions on GitHub, and you want to search for the organizations with the top contributors in a certain language. Every time a user makes a contribution on GitHub, you will have to reindex the whole organization, recompute the percentage of contributions per language across all its users, and so on. And this is a simple example.

  2. If you intend to use nested fields and parent/child mappings, you will lose performance at search time. For reference, here is the quote from the "tuning for search" documentation at https://www.elastic.co/guide/en/elasticsearch/reference/master/tune-for-search-speed.html#_document_modeling :

Documents should be modeled so that search-time operations are as cheap as possible.

In particular, joins should be avoided. nested can make queries several times slower and parent-child relations can make queries hundreds of times slower. So if the same questions can be answered without joins by denormalizing documents, significant speedups can be expected.
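
To make the first problem concrete, here is a minimal Python sketch of the update fan-out, using plain dictionaries rather than a real Elasticsearch client. The data, the `build_org_document` and `record_contribution` helpers, and the field names are all hypothetical; the point is only that one new contribution forces the whole denormalized organization document to be recomputed.

```python
from collections import Counter

# Hypothetical source-of-truth data: which users belong to which organization,
# and how many contributions each user has made per language.
org_members = {"acme": ["alice", "bob"]}
contributions = {
    "alice": Counter({"python": 3, "go": 1}),
    "bob": Counter({"python": 1}),
}


def build_org_document(org):
    """Recompute the full denormalized document for one organization."""
    totals = Counter()
    for user in org_members[org]:
        totals.update(contributions[user])
    grand_total = sum(totals.values())
    return {
        "org": org,
        "language_share": {
            lang: count / grand_total for lang, count in totals.items()
        },
    }


def record_contribution(user, language):
    """Store one contribution and return every org document that must be reindexed."""
    contributions[user][language] += 1
    return [
        build_org_document(org)
        for org, members in org_members.items()
        if user in members
    ]


# A single contribution by bob changes the shares for the whole organization.
docs_to_reindex = record_contribution("bob", "go")
```

Every user in the organization is read again for each single write, which is the cost that grows as organizations and relationship types multiply.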

Relationships, on the other hand, are handled very well by a graph database like Neo4j. Neo4j in turn lacks the search features Elasticsearch provides: full-text search is possible, but it does not perform as well and adds some burden to your application.
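
As a rough illustration of why relationship queries fit a graph model, the sketch below stores only person-to-company edges and derives colleagues by traversal, instead of materializing every pair. It is plain Python with made-up names and data, not Neo4j code; in Neo4j the same idea would be expressed as a short Cypher pattern match.

```python
from collections import defaultdict

# Hypothetical WORKS_FOR edges: (person, company).
works_for = [
    ("alice", "acme"),
    ("bob", "acme"),
    ("carol", "acme"),
    ("carol", "initech"),
    ("dave", "initech"),
]

# Index the edges in both directions so each hop is a dictionary lookup.
companies_of = defaultdict(set)
employees_of = defaultdict(set)
for person, company in works_for:
    companies_of[person].add(company)
    employees_of[company].add(person)


def colleagues(person):
    """Two-hop traversal: person -> companies -> other employees."""
    found = set()
    for company in companies_of[person]:
        found |= employees_of[company]
    found.discard(person)
    return found
```

Only five edges are stored here, and a new hire adds one edge, whereas an explicit colleague table would add a row per existing coworker.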

Side note: when you talk about "storing" data, Elasticsearch is a search engine, not a database (although it is often used as one), while Neo4j is a fully transactional database.

However, combining both is the winning approach. We have written an article describing this process, which we call Graph-Aided Search, along with a set of open source plugins for both Elasticsearch and Neo4j that give you a powerful two-way integration out of the box.

You can read more about it here: http://graphaware.com/neo4j/2016/04/20/graph-aided-search-the-rise-of-personalised-content.html
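
One way to picture combining the two is as a re-ranking step: take the hits a search engine returns, then adjust their scores with a signal computed from the graph. The snippet below is only an illustration in plain Python under that assumption; the `rerank` function, the scores, and the blending formula are invented for the example and are not the API of the plugins mentioned above.

```python
def rerank(search_hits, graph_scores, weight=0.5):
    """Blend text relevance with a graph-derived score and sort descending.

    search_hits:  list of (doc_id, text_score) as a search engine might return.
    graph_scores: dict of doc_id -> score from a graph query (for example,
                  how many of the user's friends interacted with the document).
    """
    blended = [
        (doc_id, text_score + weight * graph_scores.get(doc_id, 0.0))
        for doc_id, text_score in search_hits
    ]
    return sorted(blended, key=lambda hit: hit[1], reverse=True)


# Hypothetical inputs: doc "b" is a weaker text match but popular among friends.
hits = [("a", 2.0), ("b", 1.5), ("c", 1.0)]
friend_signal = {"b": 3.0}
ranked = rerank(hits, friend_signal)
```

With these made-up numbers, document "b" moves ahead of "a" because the graph signal outweighs its lower text score.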

Christophe Willemsen answered Nov 15 '22 14:11
