I want to build a system that has to answer queries in real time. I would have to update the data every hour and add about a million documents. Can we use Elasticsearch for this, or should I go with NoSQL?
Give memory to the filesystem cache: Elasticsearch relies heavily on the filesystem cache to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.
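To make the "at least half" rule concrete, here is a minimal sketch of the arithmetic. The function name `split_memory` and the 64 GB node are hypothetical, chosen only for illustration; the heap itself would be set through the JVM heap options, which this sketch does not touch.

```python
def split_memory(total_ram_gb: float, heap_fraction: float = 0.5) -> dict:
    """Split a node's RAM between the JVM heap and the OS filesystem cache.

    heap_fraction is capped at 0.5 so that at least half of the memory
    is left to the filesystem cache, as the guideline above suggests.
    """
    if not 0 < heap_fraction <= 0.5:
        raise ValueError("heap_fraction must be in (0, 0.5]")
    heap_gb = total_ram_gb * heap_fraction
    return {"heap_gb": heap_gb, "filesystem_cache_gb": total_ram_gb - heap_gb}


# Hypothetical node with 64 GB of RAM.
plan = split_memory(64)
print(plan)
```

A smaller `heap_fraction` leaves even more room for the cache, which may help when the hot part of the index is large.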
As you can see in the figure below, RediSearch built its index in 221 seconds versus 349 seconds for Elasticsearch, or 58% faster.
Elasticsearch allows you to store, search, and analyze huge volumes of data quickly and in near real time, and it gives back answers in milliseconds. It achieves fast search responses because, instead of searching the text directly, it searches an index.
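The idea of searching an index instead of the text can be illustrated with a toy inverted index. This is only a sketch of the general technique, not Elasticsearch's actual implementation, which adds analyzers, relevance scoring and on-disk structures; the names `build_inverted_index` and `search` and the sample documents are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Set


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every lowercased, whitespace-separated term to the ids of the documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "fast search engine",
    2: "search an index",
    3: "slow text scan",
}
index = build_inverted_index(docs)
print(search(index, "search index"))
```

A query only touches the entries for its own terms, so the cost depends on how many documents match rather than on the total amount of text stored.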
Using Elasticsearch as a cache is fair. You can easily maintain it as a cache layer on top of your primary storage.
1) But keep an eye on your reindexing strategy. Adding 1 million documents to the cluster every hour will be a very heavy operation on your hardware in terms of disk I/O.
2) Also keep an eye on concurrency issues during bulk indexing, and tune it to an optimum by varying the bulk size (documents per request), the thread pool size and the queue size. The default queue size for bulk indexing is 50 (this default varies between Elasticsearch versions).
Reference: Elasticsearch thread pool documentation
Also, what is your cluster architecture: number of nodes, replicas and shards?
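For a sense of scale, a million documents an hour averages out to roughly 278 documents per second (1,000,000 / 3,600), so the bulk size is the main knob to experiment with. Below is a minimal sketch of splitting documents into bulk request bodies in the newline-delimited JSON format the bulk API expects. The function name `bulk_bodies`, the index name `documents`, the `id` field and the batch size of 1000 are all assumptions for illustration, and older Elasticsearch versions may also expect a `_type` in the action line. Sending the bodies and handling rejections when the queue is full is left out.

```python
import json
from typing import Dict, Iterable, Iterator, List


def bulk_bodies(docs: Iterable[Dict], index_name: str, batch_size: int) -> Iterator[str]:
    """Yield newline-delimited JSON bulk bodies holding at most batch_size documents each."""
    lines: List[str] = []
    count = 0
    for doc in docs:
        # Each document takes two lines: the action metadata, then the source.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # a bulk body must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"


# Hypothetical data: 2500 small documents split into batches of 1000.
docs = ({"id": i, "title": f"doc {i}"} for i in range(2500))
bodies = list(bulk_bodies(docs, "documents", batch_size=1000))
print(len(bodies))  # 3
```

Varying `batch_size` while watching indexing throughput and rejected requests is one way to look for the optimum mentioned in point 2.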