Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Elasticsearch performance with heavy usage of search_after and PIT

We are planning to integrate search_after query with UI pagination. If we use PIT and keep the search context alive between UI page fetches, which could be minutes depending on user think time, would that scale well?. We have to support at least a few hundred concurrent users.

Also how would that affect other searches and the background ingestion process?. Documentation says lucene segment merging is impacted by open PIT contexts. We have a fairly rapidly changing large index.

like image 851
vjgorla Avatar asked Oct 28 '25 03:10

vjgorla


1 Answers

Answering my own question with a response from Elastic:

For scroll requests we have a limitation for the max number of open scroll context of 500, because PIT contexts are much more lightweight, we don’t have any limit on the number of PIT contexts, so you can open as many PIT contexts as possible. We probably need to introduce some limitation though. In the worst case scenario, when you constantly open PIT contexts with a very long keep_alive parameter and constantly update your indices, you may ran out of file descriptors or heap memory, because as you rightly noticed segments used by PIT contexts are being kept and not being deleted by merge.

On other hand, if you use relatively small keep_alive, say 10-15 mins, use high enough refresh_interval not to create many segments, and regularly monitor the number of PIT contexts with GET /_nodes/stats/indices/search than probably it will work fine.

like image 71
vjgorla Avatar answered Oct 30 '25 09:10

vjgorla



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!