What does Elasticsearch automatic slicing do?

Tags:

elasticsearch

I find the documentation very laconic about this feature. I tried searching for other explanations of it, but to no avail. Nor have I managed to find out what a slice is in Elasticsearch.


asked Apr 04 '17 15:04

Jindřich Mynarz

1 Answer

Automatic slicing is a way to parallelize work for a few different endpoints, such as reindex, update by query and delete by query.

All three of these APIs work the same way: they run a scroll query over the source index. Scroll queries are a more efficient way to retrieve big result sets than normal paged queries, and they can be improved further by slicing them.

To put it plainly: if a query is expected to return a large number of hits, you can run a normal query and page through the results using from/size, but that performs poorly because of deep paging. To get around that issue, ES lets you use scroll queries to fetch the results in batches of N hits. Those scroll queries can be improved further by slicing them, i.e. splitting the scroll into multiple slices that your client application can consume independently.
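To make the idea of a slice more concrete, here is a minimal sketch of the search bodies a client could send for a manually sliced scroll. The `slice` object with `id` and `max` is the sliced scroll syntax; the helper name `sliced_scroll_bodies`, the index name `my-index` and the chosen sizes are only assumptions for illustration.

```python
import json


def sliced_scroll_bodies(query, max_slices, size):
    """Build one search body per slice (hypothetical helper, not an ES API)."""
    if max_slices < 2:
        # With a single slice there is nothing to split: use a plain scroll.
        raise ValueError("max_slices must be at least 2")
    return [
        {"slice": {"id": slice_id, "max": max_slices}, "size": size, "query": query}
        for slice_id in range(max_slices)
    ]


bodies = sliced_scroll_bodies({"match_all": {}}, max_slices=5, size=10_000)
for body in bodies:
    # Each body would be sent as its own request, e.g.
    # POST /my-index/_search?scroll=1m   (index name is an assumption)
    print(json.dumps(body))
```

Each of the five bodies starts its own scroll, and each scroll only returns the documents that belong to its slice.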

So, say you have a query that is expected to return 1,000,000 hits, and you want to scroll over that result set in batches of 50,000 hits. With a normal scroll query (i.e. without slicing), your client application has to make the first scroll call and then about 20 more sequential calls (i.e. one after another) to retrieve each batch of 50K hits.
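The call count above can be checked with a small simulation. This is only a sketch that counts round trips under the usual client pattern (keep scrolling until a batch comes back empty); it does not talk to a cluster, and `count_followup_calls` is a made-up name.

```python
def count_followup_calls(total_hits, batch_size):
    """Count the scroll calls made after the initial search request."""
    # The initial search request already returns the first batch.
    remaining = total_hits - min(batch_size, total_hits)
    followup_calls = 0
    while True:
        followup_calls += 1
        batch = min(batch_size, remaining)
        remaining -= batch
        if batch == 0:
            # An empty batch tells the client that the scroll is exhausted.
            break
    return followup_calls


calls = count_followup_calls(total_hits=1_000_000, batch_size=50_000)
print(calls)
```

Under these assumptions, 19 of the follow-up calls return the remaining batches and the last one comes back empty, and all of them have to run one after another.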

By using slicing, you can parallelize those scroll calls. If your client application is multi-threaded, you can split the scroll into, e.g., 5 slices. Each slice is an independent scroll over roughly a fifth of the result set (~200K hits), so 5 different threads can each consume one slice, for instance in batches of ~10K hits, instead of a single thread consuming every batch of 50K hits. You can thus leverage the full computing power of your client application to consume those hits.
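The following toy simulation illustrates how five workers could each consume one slice. It runs entirely in memory: the modulo in `slice_of` is only a stand-in for however the server assigns documents to slices, and all names here are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def slice_of(doc_id, max_slices):
    # Toy stand-in for the server-side assignment of a document to a slice.
    return doc_id % max_slices


def consume_slice(doc_ids, slice_id, max_slices, batch_size):
    """Walk through one slice in batches and return how many hits were seen."""
    mine = [d for d in doc_ids if slice_of(d, max_slices) == slice_id]
    seen = 0
    for start in range(0, len(mine), batch_size):
        batch = mine[start:start + batch_size]
        seen += len(batch)  # a real client would process the batch here
    return seen


doc_ids = range(1_000_000)
max_slices = 5
with ThreadPoolExecutor(max_workers=max_slices) as pool:
    futures = [
        pool.submit(consume_slice, doc_ids, s, max_slices, 10_000)
        for s in range(max_slices)
    ]
    per_slice = [f.result() for f in futures]

print(per_slice)       # hits consumed by each worker
print(sum(per_slice))  # total across all workers
```

In a real client each worker would be waiting on network I/O for its own scroll, which is where running the slices concurrently pays off.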

The number of slices should ideally be a multiple of the number of shards in the source index. For the best performance, pick the same number of slices as there are shards in the source index. For that reason, you might want to use automatic slicing (slices=auto) instead of manual slicing, as ES will then pick that number for you.
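As a rough sketch, this is what a reindex request with automatic slicing could look like when assembled by hand. The `slices` query parameter and the `source`/`dest` body are part of the reindex API; the helper `reindex_request` and the index names are assumptions, and nothing is actually sent here.

```python
from urllib.parse import urlencode


def reindex_request(source_index, dest_index, slices="auto"):
    """Return (method, path, body) for a sliced reindex (hypothetical helper)."""
    path = "/_reindex?" + urlencode({"slices": slices})
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    return "POST", path, body


# Index names are placeholders for the example.
method, path, body = reindex_request("old-index", "new-index")
print(method, path)
print(body)
```

Passing a number instead of "auto" (e.g. `slices=5`) would request that many slices explicitly.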


answered Sep 23 '22 06:09

Val
