Elasticsearch "block until refresh" / "wait for doc to be searchable" alternatives

Tags: performance, optimization, elasticsearch

I need to index/update a document in Elasticsearch and wait until it is searchable (that is, until a refresh has happened). There is a related issue on GitHub: https://github.com/elasticsearch/elasticsearch/issues/1063

I don't want to force the refresh because it hurts indexing performance and I will need to perform this operation really often. I tried waiting for 1 second, as described in the GitHub issue. It works really well as long as Elasticsearch is not under pressure, but when there is not much RAM left (which might happen occasionally) I have seen the refresh take up to 5 or 6 seconds. So I tried another way.

I have written a helper function in my backend that waits for the "searchable" document to reach a given version. It is quite simple:

- GET the document with realtime=false
- if there is a result
    - if result.version >= wanted.version
        - return
    - else
        - wait a little more and retry
- else (the doc is not found)
    - HEAD the document with realtime=true (to test whether the doc exists in the transaction log)
    - if the doc is found (then it has just been created)
        - wait a little more and retry
    - else
        - return (the doc might have been created and deleted really fast)

The wanted version is the version returned by Elasticsearch when the document was indexed.
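The steps above could be sketched roughly as follows in Python. This is only an illustration, not the original helper: `get_searchable_version` and `exists_realtime` are hypothetical callables standing in for the GET with realtime=false and the HEAD with realtime=true, and the exponential backoff between retries is an added assumption meant to reduce the number of calls when the cluster is slow.

```python
import time
from typing import Callable, Optional


def wait_until_searchable(
    get_searchable_version: Callable[[], Optional[int]],  # hypothetical: GET with realtime=false, None if not found
    exists_realtime: Callable[[], bool],                   # hypothetical: HEAD with realtime=true
    wanted_version: int,
    timeout: float = 10.0,
    initial_delay: float = 0.05,
    max_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the searchable copy has reached wanted_version
    (or the document no longer exists), and False on timeout."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        version = get_searchable_version()
        if version is not None:
            if version >= wanted_version:
                return True
        elif not exists_realtime():
            # Not searchable and not in the transaction log either:
            # the doc was probably created and deleted really fast.
            return True
        if clock() >= deadline:
            return False
        sleep(delay)
        # Assumption: back off exponentially so a struggling cluster gets fewer requests.
        delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters are injectable only so the loop can be exercised without real waiting; in a backend they would keep their defaults.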

This algorithm works, but it is far from perfect:

- It makes more calls to Elasticsearch when the cluster is under pressure, which is not a good idea.
- I have seen Elasticsearch reset the version number when a doc has been deleted for some time. If for some reason the function misses that, we might wait until the doc reaches this version again (that's why I also added a timeout).

Does someone have a better solution? Scaling automatically is not an acceptable answer right now.

asked Jan 23 '15 14:01

nharraud

1 Answer

As Guillaume Massé said, a solution is about to be merged in Elasticsearch: https://github.com/elastic/elasticsearch/issues/1063#issuecomment-223368867

So I would advise waiting for the built-in functionality rather than implementing a custom solution.
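For readers arriving later: as far as I know, the built-in functionality in question shipped as the `refresh=wait_for` request parameter (Elasticsearch 5.0 and later), which makes an index, update, delete or bulk request return only after a refresh has made the change searchable. A minimal sketch of building such a request with the Python standard library is below; the helper name, base URL, index name and document are made up for illustration, and the `/{index}/_doc/{id}` path is an assumption that fits recent versions (older versions include a mapping type in the path).

```python
import json
import urllib.request


def build_index_request(base_url: str, index: str, doc_id: str, document: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT that asks Elasticsearch to respond
    only once the document is visible to search."""
    # Assumption: typeless /{index}/_doc/{id} endpoint of recent Elasticsearch versions.
    url = f"{base_url}/{index}/_doc/{doc_id}?refresh=wait_for"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical values; sending would be urllib.request.urlopen(request).
request = build_index_request("http://localhost:9200", "articles", "42", {"title": "hello"})
```

Unlike `refresh=true`, this does not force an extra refresh; it waits for the next scheduled one, so it should avoid the indexing-performance cost mentioned in the question.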

answered Oct 29 '22 13:10

nharraud
