I've been doing a lot of research on Elasticsearch, and I keep stumbling on the question of whether or not a database is still needed.
Current Hibernate-Search and Relational Design
My current application is written in Java using Hibernate, Hibernate Search, and a MySQL database. Hibernate Search is built on Lucene and automatically manages my indexes for me during database transactions. It also searches against the index and then pulls full records from the database based on the stored primary keys, so you don't have to store your entire data model in the index. This has worked wonderfully. However, as my application grows, I've continually run into scaling issues and cost, due to the fact that the Lucene indexes need to live on each application server and you then need another library to keep the indexes in sync. The other issue with this design is that it requires more memory on all the application servers, since the indexes are replicated and stored with the application.
Database or No Database
Coming from the Hibernate Search school of thought, I'm confused about whether you're supposed to store your entire data model in Elasticsearch and do away with the traditional database, or whether you're supposed to store only your search data in the indexes and, as with Hibernate Search, return primary keys that are used to pull complete records from your relational database.
Managing the Indexes
Hibernate-Search API
I also saw the following in the Hibernate Search roadmap, under "API / SPI for alternative backends" (http://hibernate.org/search/roadmap/):
Define API / SPI abstraction to allow for future external backends integrations such as Apache Solr and Elastic Search.
Does anybody have any input on this? Is Hibernate Search capable of managing the Elasticsearch indexes automatically for you, just as it does with its native configuration?
If No Database
What would be the drawback of not using a database for anything search related?
Since its release in 2010, Elasticsearch has become one of the world's top ten databases by popularity. Built on Apache's Lucene search engine and written in Java, it remains an open-source product and stores data as JSON documents in a NoSQL format.
Elasticsearch is also often integrated with MongoDB: MongoDB is used for storage, and Elasticsearch performs full-text indexing over the data. This combination of MongoDB for storing and Elasticsearch for indexing is a common architecture that many organizations follow.
Elasticsearch has the speed, scale, and flexibility your data needs — and it speaks SQL. Use traditional database syntax to unlock non-traditional performance, like full text search across petabytes of data with real-time results.
Elasticsearch (ES) is a document-oriented search engine designed to store, retrieve, and manage structured, semi-structured, and unstructured data. By default it analyzes text fields with Lucene's standard analyzer, and it can guess field types automatically through dynamic mapping.
I faced a similar problem before, on an Elasticsearch setup with the data in MySQL. The solution was to store in Elasticsearch only the data that needed to be searched, along with a reference to the row in the relational database. If the data in Elasticsearch was enough for the request, I returned only the Elasticsearch record. If it wasn't, I went to the relational database and returned that record instead.
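A minimal sketch of that split, written in plain Java with two in-memory maps standing in for Elasticsearch and MySQL. The class name, the field names (`id`, `title`, `body`) and the `databaseReads` counter are all made up for illustration; a real setup would issue a search query and a SQL lookup by primary key instead.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory sketch: the two maps are stand-ins, not real Elasticsearch or MySQL clients.
class SearchWithFallback {
    // Stand-in for the search index: primary key -> searchable fields only.
    final Map<Long, Map<String, Object>> index = new HashMap<>();
    // Stand-in for the relational table: primary key -> complete row.
    final Map<Long, Map<String, Object>> database = new HashMap<>();
    // Counts database reads so the effect of the fallback is visible.
    int databaseReads = 0;

    // Writes the full row to the database and a slim document to the index.
    void save(long id, String title, String body) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", id);
        row.put("title", title);
        row.put("body", body);
        database.put(id, row);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id); // reference back to the relational row
        doc.put("title", title);
        index.put(id, doc);
    }

    // Serves the request from the index when it holds every requested field,
    // otherwise loads the complete row from the database by primary key.
    Map<String, Object> fetch(long id, Collection<String> fields) {
        Map<String, Object> doc = index.get(id);
        if (doc == null) {
            return null; // not indexed, so search would never have returned it
        }
        if (doc.keySet().containsAll(fields)) {
            return doc;
        }
        databaseReads++;
        return database.get(id);
    }

    public static void main(String[] args) {
        SearchWithFallback store = new SearchWithFallback();
        store.save(42L, "Scaling search", "Full article text");
        System.out.println(store.fetch(42L, Arrays.asList("id", "title")));
        System.out.println(store.fetch(42L, Arrays.asList("title", "body")));
        System.out.println("database reads: " + store.databaseReads);
    }
}
```

The first call in `main` is answered from the index alone; the second asks for `body`, which is not indexed, so it falls through to the database.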
I split it into these two paths because of the lag that the relational database introduced (it was an API for a high-demand web service, and Elasticsearch was faster). That introduced a synchronization problem, but it was not critical for my application: we periodically pulled the data from the relational database and reindexed only the changed data set in Elasticsearch. Elasticsearch can reindex just a subset of records.
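Roughly, the periodic sync can look like the sketch below. It assumes every row carries a last-modified timestamp (the hypothetical `updatedAt` field) and keeps a high-water mark between runs; the maps again stand in for MySQL and Elasticsearch, and all names are illustrative rather than taken from any library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory sketch of "reindex only what changed since the last run".
class IncrementalReindexer {
    // Hypothetical row shape; updatedAt is an assumed last-modified column (epoch millis).
    static final class Row {
        final long id;
        final String title;
        final long updatedAt;

        Row(long id, String title, long updatedAt) {
            this.id = id;
            this.title = title;
            this.updatedAt = updatedAt;
        }
    }

    final Map<Long, Row> database = new HashMap<>(); // stand-in for the relational table
    final Map<Long, String> index = new HashMap<>(); // stand-in for the search index
    long lastSync = Long.MIN_VALUE;                  // newest updatedAt already indexed

    // Plays the role of a query such as: SELECT ... WHERE updated_at > ?
    List<Row> changedSince(long since) {
        List<Row> changed = new ArrayList<>();
        for (Row row : database.values()) {
            if (row.updatedAt > since) {
                changed.add(row);
            }
        }
        return changed;
    }

    // Reindexes only rows modified after the previous run and returns how many were written.
    int syncOnce() {
        List<Row> changed = changedSince(lastSync);
        long newest = lastSync;
        for (Row row : changed) {
            index.put(row.id, row.title); // overwrite the stale document
            newest = Math.max(newest, row.updatedAt);
        }
        lastSync = newest;
        return changed.size();
    }

    public static void main(String[] args) {
        IncrementalReindexer sync = new IncrementalReindexer();
        sync.database.put(1L, new Row(1L, "first", 100L));
        sync.database.put(2L, new Row(2L, "second", 200L));
        System.out.println("indexed: " + sync.syncOnce());
        sync.database.put(1L, new Row(1L, "first, edited", 300L));
        System.out.println("indexed: " + sync.syncOnce());
    }
}
```

Note that a timestamp comparison like this does not notice deleted rows; those need separate handling, for example a soft-delete flag that the sync also picks up.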
We considered not using a database and storing everything in the search engine, but whether that is acceptable depends on the importance of your data. If you can't risk losing any part of your data, don't store it only in Elasticsearch. We always treated the data in Elasticsearch as perishable, on the understanding that the search indexes could be reconstructed from the database.
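To illustrate what "perishable" means here, this is a toy sketch in which the index is purely derived data: a small inverted index (token to row IDs) that can be thrown away and rebuilt from the database at any time. The class and method names are invented for the example, and the tokenizer is a naive whitespace split, nothing like what Elasticsearch actually does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// In-memory sketch: the database is the source of truth, the index is disposable.
class RebuildableIndex {
    final Map<Long, String> database = new HashMap<>(); // stand-in for the relational table
    Map<String, Set<Long>> index = new HashMap<>();     // derived: token -> matching row IDs

    // Discards whatever the index held and reconstructs it from the database rows.
    void rebuild() {
        Map<String, Set<Long>> fresh = new HashMap<>();
        for (Map.Entry<Long, String> row : database.entrySet()) {
            for (String token : row.getValue().toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    fresh.computeIfAbsent(token, t -> new TreeSet<>()).add(row.getKey());
                }
            }
        }
        index = fresh;
    }

    // Returns the IDs of rows containing the term; full rows are then loaded by ID.
    Set<Long> search(String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.<Long>emptySet());
    }

    public static void main(String[] args) {
        RebuildableIndex store = new RebuildableIndex();
        store.database.put(1L, "Elasticsearch with MySQL");
        store.database.put(2L, "MySQL only");
        store.rebuild();
        System.out.println(store.search("mysql"));
        store.index.clear(); // simulate losing the search cluster
        System.out.println(store.search("mysql"));
        store.rebuild();     // nothing is lost because the database still has the rows
        System.out.println(store.search("mysql"));
    }
}
```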