Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to deal with Elasticsearch index delay

Here's my scenario:

I have a page that contains a list of users. I create a new user through my web interface and save it to the server. The server indexes the document in elasticsearch and returns successfully. I am then redirected to the list page which doesn't contain the new user because it can take up to 1-second for documents to become available for search in elasticsearch

Near real-time search in elasticsearch.

The elasticsearch guide says you can manually refresh the index, but says not to do it in production.

...don’t do a manual refresh every time you index a document in production; it will hurt your performance. Instead, your application needs to be aware of the near real-time nature of Elasticsearch and make allowances for it.

I'm wondering how other people get around this? I wish there was an event or something I could listen for that would tell me when the document was available for search but there doesn't appear to be anything like that. Simply waiting for 1-second is plausible but it seems like a bad idea because it presumably could take much less time than that.

Thanks!

like image 826
Troy Avatar asked Jul 19 '15 09:07

Troy


People also ask

What is indexing latency in Elasticsearch?

One of the key metrics for a search system is the indexing latency, the amount of time it takes for new information to become available in the search index. This metric is important because it determines how quickly new results show up. Not all search systems need to update their contents quickly.

Why is my Elasticsearch slow?

Slow queries are often caused byPoorly written or expensive search queries. Poorly configured Elasticsearch clusters or indices. Saturated CPU, Memory, Disk and network resources on the cluster.


1 Answers

Even though you can force ES to refresh itself, you've correctly noticed that it might hurt performance. One solution around this and what people often do (myself included) is to give an illusion of real-time. In the end, it's merely a UX challenge and not really a technical limitation.

When redirecting to the list of users, you could artificially include the new record that you've just created into the list of users as if that record had been returned by ES itself. Nothing prevents you from doing that. And by the time you decide to refresh the page, the new user record would be correctly returned by ES and no one cares where that record is coming from, all the user cares about at that moment is that he wants to see the new record that he's just created, simply because we're used to think sequentially.

Another way to achieve this is by reloading an empty user list skeleton and then via Ajax or some other asynchronous way, retrieve the list of users and display it.

Yet another way is to provide a visual hint/clue on the UI that something is happening in the background and that an update is to be expected very shortly.

In the end, it all boils down to not surprise users but to give them enough clues as to what has happened, what is happening and what they should still expect to happen.

UPDATE:

Just for completeness' sake, this answer predates ES5, which introduced a way to make sure that the indexing call would not return until the document is either visible when searching the index or return an error code. By using ?refresh=wait_for when indexing your data you can be certain that when ES responds, the new data will be indexed.

like image 71
Val Avatar answered Sep 21 '22 21:09

Val