We have a new project that needs to index a large amount of data and serve it in real time. We also have complex search requirements: facets, full text, geospatial, and so on.
The first prototype indexes into MongoDB first and then into Elasticsearch, because I had read that Elasticsearch does not apply a checksum to stored files, so its index cannot be fully trusted. But recent versions (since 1.5) do apply a checksum, so I am wondering whether we can use Elasticsearch as the primary data store, and what the benefit would be of using MongoDB in addition to Elasticsearch.
I can't find an up-to-date answer about these features in Elasticsearch.
Thanks a lot
Integrate Elasticsearch and MongoDB: MongoDB is used for storage, and Elasticsearch performs full-text indexing over the data. This combination of MongoDB for storing and Elasticsearch for indexing is a common architecture that many organizations follow.
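As a rough illustration of that split, here is a minimal sketch of the write path. It assumes the official pymongo and elasticsearch Python clients (8.x-style keyword API), local default instances, and a hypothetical catalog.products collection mirrored by a products index; MongoDB keeps the full document, and only the searchable fields go into Elasticsearch.

```python
from pymongo import MongoClient
from elasticsearch import Elasticsearch

mongo = MongoClient("mongodb://localhost:27017")
es = Elasticsearch("http://localhost:9200")

products = mongo["catalog"]["products"]  # hypothetical database/collection names


def save_product(product: dict) -> str:
    """Store the full document in MongoDB, then index a search view of it in ES."""
    # MongoDB remains the system of record and holds every field.
    inserted = products.insert_one(product)
    mongo_id = str(inserted.inserted_id)

    # Only the fields needed for search go into Elasticsearch,
    # keyed by the MongoDB _id so results can be resolved back later.
    es.index(
        index="products",
        id=mongo_id,
        document={
            "name": product.get("name"),
            "description": product.get("description"),
            "tags": product.get("tags", []),
        },
    )
    return mongo_id
```

The shared id is what later lets a search hit in ES be resolved back to the full MongoDB document.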
MongoDB is ~1.15× faster than Elasticsearch with a default-mapped index, and ~1.20× faster than a custom-mapped one.
Talking about arguments to use Mongo instead of/together with ES:

1. User/role management. Elasticsearch covers this with the Shield plugin, but Shield ships only with a Gold/Platinum subscription for production use.

2. Schema. ES is based on Lucene and written in Java. The core idea of the tool is to index and search documents, and working this way requires index consistency. On the back end, every document has to be fitted into a flat Lucene index, which requires some understanding of how ES should deal with your nested documents and values, and how you should organize your indexes to maintain a balance between speed and data completeness/consistency. Working with ES requires you to keep the schema in mind constantly. For example, you can index almost anything into ES without supplying a mapping in advance; ES can "guess" a mapping on the fly, but it sometimes guesses wrong, and an implicit mapping can be evil, because once it is set it cannot be changed without reindexing the whole index. So it is better not to treat ES as a schemaless store, because you may step on a rake some day (and it will be painful), but rather to treat it as schema-intensive, at least when you work with documents that can be sliced into concrete fields. See the first sketch after this list.

3. Handling non-table-like data. MongoDB has GridFS, which gives you the ability to handle large chunks of data behind the same interface; i.e., you can store binary data in Mongo and retrieve it through the same interface, from your code's perspective. See the second sketch after this list.
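To illustrate the schema point, here is a minimal sketch (assuming the elasticsearch Python client, 8.x-style API, and a hypothetical products index) of creating the index with an explicit mapping up front instead of letting ES guess one dynamically:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Declaring the mapping before indexing avoids relying on dynamic
# ("guessed") mappings, which cannot be changed later without reindexing.
es.indices.create(
    index="products",
    mappings={
        "properties": {
            "name": {"type": "text"},
            "description": {"type": "text"},
            "tags": {"type": "keyword"},
            "location": {"type": "geo_point"},
            "created_at": {"type": "date"},
        }
    },
)
```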
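And a small sketch of the GridFS point, assuming pymongo's bundled gridfs module and a hypothetical manual.pdf file; binary payloads are stored and read back through the same MongoDB connection as the rest of the data:

```python
import gridfs
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["catalog"]
fs = gridfs.GridFS(db)

# Store a binary blob (e.g. a PDF) alongside the rest of the data.
with open("manual.pdf", "rb") as f:
    file_id = fs.put(f, filename="manual.pdf")

# Read it back through the same interface.
data = fs.get(file_id).read()
```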
Well, choose the right tool for the right job. If you require search capabilities such as full-text search, faceting, etc., then nothing can beat a full-fledged search engine; Elasticsearch (ES) vs. Solr is just a matter of choice.
You can actually feed (index) documents into ES for searching and then fetch the complete details for a particular entry from MongoDB or any other database; a sketch of that read path follows below.
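A minimal sketch of that read path, under the same assumptions as the write-path sketch above (elasticsearch-py 8.x, pymongo, a hypothetical products index keyed by the MongoDB _id):

```python
from bson import ObjectId
from elasticsearch import Elasticsearch
from pymongo import MongoClient

es = Elasticsearch("http://localhost:9200")
products = MongoClient("mongodb://localhost:27017")["catalog"]["products"]


def search_products(text: str) -> list[dict]:
    """Search in Elasticsearch, then load the full records from MongoDB."""
    hits = es.search(
        index="products",
        query={"match": {"description": text}},
    )["hits"]["hits"]

    # Each ES hit carries the MongoDB _id, so the complete document
    # (including fields that were never indexed) comes from MongoDB.
    return [
        products.find_one({"_id": ObjectId(hit["_id"])})
        for hit in hits
    ]
```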
To make your task easier, take a look at my open-source work that uses MongoDB, ES, Redis, and RabbitMQ, all integrated in one place, here on GitHub.
Please note that the application is built in .NET (C#).