
Python: Using Elasticsearch Scan to get more than 10,000 results ScanError

I want to query Elasticsearch and print every result for the query. By default a search returns at most 10,000 hits, but I'd like to go well beyond that limit. I'm working with Python.

I'm using elasticsearch.helpers.scan. It seems to work, but partway through printing the results I get this error:

elasticsearch.helpers.ScanError: Scroll request has only succeeded on 66 shards out of 80.

I'm not sure what this means at all. Could someone please explain it and suggest a fix?

Also, if there's a better or easier module/API to use than elasticsearch.helpers.scan, please let me know!

Thanks!

helloworld95 asked Apr 15 '19 19:04


2 Answers

Pass raise_on_error=False to the scan function.

from elasticsearch.helpers import scan
res = scan(es, query=query, scroll='50m', size=1000, raise_on_error=False)

This fixed it for me.
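
A caveat on this fix: as documented for the helper, raise_on_error=False only stops scan from raising when some shards fail to answer the scroll request. The hits those shards would have returned are still missing from the results. Below is a minimal sketch for noticing that, assuming you know roughly how many documents to expect (for example from a separate count request). process_hits is a hypothetical helper, not part of elasticsearch-py, and the fake hits stand in for the real scan() generator so the snippet runs without a cluster.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def process_hits(
    hits: Iterable[Dict[str, Any]],
    handle: Callable[[Dict[str, Any]], None],
    expected_total: int,
) -> Tuple[int, int]:
    """Pass every hit to `handle`; return (hits seen, hits apparently missing)."""
    seen = 0
    for hit in hits:
        handle(hit)
        seen += 1
    # A positive difference suggests some shards did not return their documents.
    missing = max(expected_total - seen, 0)
    return seen, missing


# Stand-in data (hypothetical). With a real client, `hits` would be the generator
# from scan(es, query=query, scroll='50m', size=1000, raise_on_error=False).
fake_hits = [{"_id": "1"}, {"_id": "2"}]
seen, missing = process_hits(fake_hits, print, expected_total=3)
print(f"processed {seen} hits, {missing} missing")
```

If missing is greater than zero, the output is incomplete and the underlying shard failure still needs investigating.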

Michael Gabilondo answered Oct 09 '22 18:10


To find out more about why the exception is raised, turn on DEBUG logging for the Elasticsearch Python modules you're using:

import logging
from elasticsearch import logger as elasticsearch_logger

# Install a default handler so the DEBUG records are actually printed
logging.basicConfig()
elasticsearch_logger.setLevel(logging.DEBUG)

Then check the log output around your scan() call.
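
If DEBUG output on the console is too noisy, the same idea can be pointed at a file using only the standard logging module. This is a rough sketch, assuming the client logs under the "elasticsearch" logger name (with child loggers such as "elasticsearch.helpers"); log_elasticsearch_to_file and the file name in the usage note are hypothetical.

```python
import logging


def log_elasticsearch_to_file(path: str, level: int = logging.DEBUG) -> logging.Handler:
    """Send records from the "elasticsearch" logger tree to the file at `path`."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    # Assumed logger name; child loggers propagate their records up to it.
    es_logger = logging.getLogger("elasticsearch")
    es_logger.setLevel(level)
    es_logger.addHandler(handler)
    return handler
```

Calling log_elasticsearch_to_file("es_debug.log") before scan() should leave the shard warning, and whatever the client logged just before it, in that file.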

lef answered Oct 09 '22 20:10