
Elasticsearch updates are not immediate. How do you wait for Elasticsearch to finish updating its index?

People also ask

How long does it take to update Elasticsearch?

Elasticsearch provides near real-time search. An updated or indexed document is not searchable immediately, only after the next refresh operation. By default, a refresh is scheduled every 1 second. To retrieve a document right after updating or indexing it, use the GET API instead, which fetches by ID in real time.
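
As a rough illustration of the GET-by-ID route, the sketch below reads a document back without waiting for a refresh. It assumes the official Python client (elasticsearch-py) with keyword arguments as in 7.x and later. The helper name fetch_by_id and the index and ID values are hypothetical, chosen only for this example.

```python
def fetch_by_id(client, index, doc_id):
    """Return the stored document for doc_id via the real-time GET API.

    `client` is assumed to be an elasticsearch.Elasticsearch instance
    (or any object exposing a compatible .get() method).
    """
    # GET by ID is real-time by default, so it can see a document that a
    # search would not return until the next refresh.
    response = client.get(index=index, id=doc_id)
    return response["_source"]


# Example call (requires a running cluster; names are placeholders):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# print(fetch_by_id(es, "blog", 1))
```

This only helps when you already know the document ID. Queries that go through search still depend on a refresh.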

How do I refresh Elasticsearch index?

By default, Elasticsearch refreshes indices every second, but only indices that have received at least one search request in the last 30 seconds. You can change this default interval using the index.refresh_interval setting.
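
A minimal sketch of changing that setting through the REST API, using only the Python standard library. The function name build_refresh_interval_request, the local node URL, the index name and the "5s" value are assumptions made for illustration. The request is only built here, not sent.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT <index>/_settings request."""
    # The update-settings API takes the new value under index.refresh_interval.
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Sending it requires a running cluster (placeholder URL and index):
# req = build_refresh_interval_request("http://localhost:9200", "blog", "5s")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```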

What is Elasticsearch flush?

Flushing a data stream or index is the process of making sure that any data that is currently only stored in the transaction log is also permanently stored in the Lucene index.


As of version 5.0.0, Elasticsearch has an option:

 ?refresh=wait_for

on the Index, Update, Delete, and Bulk APIs. With it, the request does not return until a refresh has made the change visible to search. (Yay!)

See https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-refresh.html for more information.
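
For reference, here is a sketch of what such a request could look like at the REST level, built with the Python standard library only. The helper name build_index_request, the node URL, and the index, ID and document values are hypothetical. The _doc path segment assumes Elasticsearch 7.x or later; older versions use a mapping type name in that position.

```python
import json
import urllib.parse
import urllib.request


def build_index_request(base_url, index, doc_id, document, refresh="wait_for"):
    """Build (but do not send) a PUT <index>/_doc/<id>?refresh=... request."""
    # refresh=wait_for asks Elasticsearch to hold the response until a
    # refresh has made this change visible to search.
    query = urllib.parse.urlencode({"refresh": refresh})
    url = f"{base_url.rstrip('/')}/{index}/_doc/{doc_id}?{query}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Sending it requires a running cluster (placeholder values):
# req = build_index_request("http://localhost:9200", "blog", 1, {"title": "hello"})
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```

Because wait_for waits for a refresh rather than forcing one, the call can take up to roughly the index's refresh interval to return.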

Edit: this functionality also appears to be available in the latest Python Elasticsearch client: https://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.index

Change your elasticsearch.update to:

elasticsearch.update(
    index='blog',
    doc_type='blog',
    id=1,
    refresh='wait_for',
    body={
        ....
    }
)

and you shouldn't need any sleep or polling.


Seems to work for me:

els.indices.refresh(index=index)
els.cluster.health(wait_for_no_relocating_shards=True, wait_for_active_shards='all')

If you use bulk helpers you can do it like this:

from elasticsearch.helpers import bulk
bulk(client=self.es, actions=data, refresh='wait_for')