 

Is there a Python ElasticSearch client that supports asynchronous requests?

I'm looking for an ElasticSearch Python client that can make asynchronous requests. For example, I'd like to be able to write code like this:

query1_future = es.search('/foobar', query1_json)
query2_future = es.search('/baz', query2_json) # Submit query 2 right after query 1, don't wait for its response
query1 = query1_future.get()
query2 = query2_future.get()

However, I don't see any clients (PyES or the official client, for example) that support this. Further, the two I'm familiar with couple the request logic with the response-processing logic, so modifying them myself seems difficult. Perhaps a sufficient interim solution would be to use grequests, the asynchronous version of Requests?
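
To make the grequests idea concrete, I'm picturing something along these lines (the host, index names, and queries are just placeholders):

import grequests  # gevent-based asynchronous wrapper around Requests

query1_json = {"query": {"match_all": {}}}
query2_json = {"query": {"match_all": {}}}

# Build both requests up front, then send them concurrently with map()
pending = [
    grequests.post("http://localhost:9200/foobar/_search", json=query1_json),
    grequests.post("http://localhost:9200/baz/_search", json=query2_json),
]
response1, response2 = grequests.map(pending)
query1, query2 = response1.json(), response2.json()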

Also, it's worth pointing out that ElasticSearch's _msearch may be a better-performing option, but for real-world applications it'd require some code restructuring.
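
Roughly, the _msearch route might look like this with the official synchronous client (again, placeholder indices and queries):

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# _msearch bundles alternating header/body lines into one HTTP request,
# so both searches travel together and come back in a single response
responses = es.msearch(body=[
    {"index": "foobar"}, {"query": {"match_all": {}}},
    {"index": "baz"},    {"query": {"match_all": {}}},
])
query1, query2 = responses["responses"]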

asked Oct 09 '13 by gatoatigrado



2 Answers

Just came across this question. There is an official asynchronous Elasticsearch client based on asyncio:

https://github.com/elastic/elasticsearch-py-async
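
A minimal sketch based on that package's asyncio interface (the host, index names, and queries here are placeholders):

import asyncio
from elasticsearch_async import AsyncElasticsearch

client = AsyncElasticsearch(hosts=["localhost"])

async def run_queries():
    # Both searches are issued concurrently; gather() waits for both responses
    return await asyncio.gather(
        client.search(index="foobar", body={"query": {"match_all": {}}}),
        client.search(index="baz", body={"query": {"match_all": {}}}),
    )

loop = asyncio.get_event_loop()
query1, query2 = loop.run_until_complete(run_queries())
loop.run_until_complete(client.transport.close())
loop.close()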

answered Oct 31 '22 by Tomáš Linhart


You can also consider the following options for performing I/O without blocking the main process, using existing clients:

  • Use multithreading on Jython or IronPython (they have no GIL and can take advantage of multiple CPU cores)
  • Use ProcessPoolExecutor from concurrent.futures on Python 3
  • Use gevent with socket monkey patching to make existing clients work over gevent sockets; this effectively makes the client asynchronous, but it requires some extra code to manage the results (see the sketch below)

Gevent is the most lightweight option (in terms of RAM/CPU) and handles the most intensive I/O, but it is also the most complex of the solutions listed. Note also that it runs in a single process; to take advantage of multiple cores, the multiprocessing package should be used.
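
As a rough illustration of the gevent option, assuming the official synchronous elasticsearch client and a cluster at localhost:9200 (index names and queries are placeholders):

from gevent import monkey
monkey.patch_all()  # patch sockets before the client is imported

import gevent
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# spawn() starts each search in its own greenlet, so neither blocks the other
g1 = gevent.spawn(es.search, index="foobar", body={"query": {"match_all": {}}})
g2 = gevent.spawn(es.search, index="baz", body={"query": {"match_all": {}}})

gevent.joinall([g1, g2])  # wait for both responses
query1, query2 = g1.value, g2.value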

answered Oct 31 '22 by luart