
sync/async insert or update ElasticSearch in Python

I'm using the Elasticsearch bulk API from Python. Does it provide both a sync and an async API?

Jack asked Mar 14 '23 03:03


1 Answer

If by sync you mean a blocking operation

In Python, the bulk functions are synchronous. The easiest way to go is through the helper:

elasticsearch.helpers.bulk(client, actions, stats_only=False, **kwargs)

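It returns a tuple with summary information: the number of successfully executed actions, plus either a list of errors or, with stats_only=True, the number of errors. Since the result is only available once the operation has finished, the call is blocking, i.e. synchronous.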
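To make the blocking behaviour concrete, here is a minimal sketch. The index name my-index and the function names build_actions, index_blocking and index_in_background are made up for illustration, and the bulk_fn argument is only there so the code can be tried without a cluster. Running the call in a thread pool is a standard-library workaround I'm assuming here, not something the helper itself provides.

```python
from concurrent.futures import ThreadPoolExecutor


def build_actions(docs, index_name):
    """Yield one bulk action per document (index_name is a hypothetical index)."""
    for doc_id, source in docs.items():
        yield {"_index": index_name, "_id": doc_id, "_source": source}


def index_blocking(client, docs, index_name="my-index", bulk_fn=None):
    """Run the bulk helper and block until it returns its summary tuple."""
    if bulk_fn is None:
        # Imported lazily so the rest of the sketch works without the package.
        from elasticsearch import helpers
        bulk_fn = helpers.bulk
    # With the default stats_only=False the helper returns
    # (number of successful actions, list of errors).
    return bulk_fn(client, build_actions(docs, index_name))


def index_in_background(executor, client, docs, **kwargs):
    """Submit the blocking call to a thread pool and return a Future."""
    return executor.submit(index_blocking, client, docs, **kwargs)
```

With a real client you would call index_blocking(es, docs) directly, or keep the calling thread free with index_in_background(executor, es, docs) and collect the tuple later through future.result().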

If by sync you mean consistency

From the bulk API:

When making bulk calls, you can require a minimum number of active shards in the partition through the consistency parameter

In Python, the bulk function has a consistency parameter, which lets you specify how many shards must have acknowledged the change before the method returns. Note that in Elasticsearch 5.0 and later this parameter was replaced by wait_for_active_shards.
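As a rough sketch of the consistency option, assuming an older client (Elasticsearch 1.x/2.x era) whose low-level bulk() accepts consistency; on 5.0 and later the keyword that replaced it is wait_for_active_shards. The index name, document type and the helper name bulk_with_consistency are hypothetical.

```python
def bulk_with_consistency(client, index_name, doc_type, docs, consistency="quorum"):
    """Index docs through the low-level bulk() call with a consistency level.

    Assumes a client version that still accepts `consistency`
    ("one", "quorum" or "all"); newer versions use
    `wait_for_active_shards` instead.
    """
    body = []
    for doc_id, source in docs.items():
        # Each document takes two entries: the action metadata, then the source.
        body.append({"index": {"_index": index_name, "_type": doc_type, "_id": doc_id}})
        body.append(source)
    # The call returns only after the cluster has answered.
    return client.bulk(body=body, consistency=consistency)
```

The response of the low-level call is the raw bulk response, so per-item failures have to be checked by the caller.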

If by timeout you mean a way to stop the operation after a while

If you need to limit the duration of a bulk operation, the low-level bulk() function is again your friend. It takes a timeout parameter that sets an explicit operation timeout.

Even more generally,

Global timeout can be set when constructing the client (see Connection's timeout parameter) or on a per-request basis using request_timeout (float value in seconds) as part of any API call.

For example:

from elasticsearch import Elasticsearch
es = Elasticsearch()
# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)
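The two timeouts can be combined on a bulk request. The following is only a sketch: it assumes a client version that accepts request_timeout as a keyword on API calls, as in the quoted documentation, and the helper name bulk_with_timeouts and the default values are made up.

```python
def bulk_with_timeouts(client, body, operation_timeout="30s", request_timeout=35.0):
    """Send a bulk request with both an operation and a request timeout.

    operation_timeout is the bulk API's own `timeout` parameter
    (a time value such as "30s"); request_timeout is the client-side
    limit in seconds for the whole HTTP request.
    """
    return client.bulk(
        body=body,
        timeout=operation_timeout,
        request_timeout=request_timeout,
    )
```

Keeping request_timeout slightly larger than the operation timeout is a design choice on my part, so that the client does not give up before the cluster has had the chance to answer.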

As a side note, I searched for the bulk() call in Java, and especially for bulk().await(), but couldn't find anything. May I ask what your source is?

Derlin answered May 16 '23 06:05