I'm new to Stack Overflow, so I apologize if I make a mistake.
I have to write a Python script that collects some data from Elasticsearch and then writes it to a database. I am struggling to collect the data with Elasticsearch because the company I work for is behind a proxy.
The script works without a proxy, but I don't know how to pass the proxy settings down to Elasticsearch.
The following code works without a proxy:
es = Elasticsearch(['https://user:[email protected]/elasticsearch'])
res = es.search(index=index, body=request, search_type="count")
I tried the following from behind the proxy:
es = Elasticsearch(['https://user:[email protected]/elasticsearch'],
                   _proxy='http://proxy.org',
                   _proxy_headers={'basic_auth': 'user:pw'})
res = es.search(index=index, body=request, search_type="count")
return res
Does anyone know the keywords I have to pass to Elasticsearch so that it uses the proxy?
Any help would be appreciated.
Thanks.
I got an answer on GitHub:
https://github.com/elastic/elasticsearch-py/issues/275#issuecomment-143781969
Thanks a ton again!
from elasticsearch import Elasticsearch, RequestsHttpConnection

class MyConnection(RequestsHttpConnection):
    def __init__(self, *args, **kwargs):
        # Pull the proxies out of the kwargs before the parent constructor
        # sees them, then attach them to the underlying requests session.
        proxies = kwargs.pop('proxies', {})
        super(MyConnection, self).__init__(*args, **kwargs)
        self.session.proxies = proxies

es = Elasticsearch([es_url],
                   connection_class=MyConnection,
                   proxies={'https': 'http://user:[email protected]:port'})
print(es.info())
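For completeness, a quick usage sketch (the index name and query here are hypothetical, purely for illustration): once the client is built with the custom connection class, every request goes through the proxy configured on the underlying requests session.
# Hypothetical index and query, just to show the client in use
res = es.search(index='my-index', body={'query': {'match_all': {}}})
print(res['hits']['total'])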
Generally, we don't need to add extra code for the proxy; the Python low-level module should be able to use the system proxy (i.e. http_proxy) directly.
In later releases (at least 6.x) we can use the requests module instead of urllib3 to solve this problem nicely, see https://elasticsearch-py.readthedocs.io/en/master/transports.html
# make sure the http_proxy is in system env
from elasticsearch import Elasticsearch, RequestsHttpConnection
es = Elasticsearch([es_url], connection_class=RequestsHttpConnection)
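If the proxy variables are not already set system-wide, you can set them in the process environment before creating the client; requests picks these variables up automatically. The proxy URL below is a placeholder:
import os

# Placeholder proxy address; substitute your company's proxy
os.environ['HTTP_PROXY'] = 'http://user:[email protected]:3128'
os.environ['HTTPS_PROXY'] = 'http://user:[email protected]:3128'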
Another possible problem is that search uses the GET method by default; this was rejected by my old cache server (squid/3.19), so the extra parameter send_get_body_as should be added, see https://elasticsearch-py.readthedocs.io/en/master/#environment-considerations
from elasticsearch import Elasticsearch
es = Elasticsearch(send_get_body_as='POST')
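Putting both workarounds together, a client that tunnels through the system proxy and sends search bodies via POST might look like the sketch below (the cluster URL is a placeholder, and it assumes http_proxy/https_proxy are set in the environment):
from elasticsearch import Elasticsearch, RequestsHttpConnection

# RequestsHttpConnection honors http_proxy/https_proxy from the environment;
# send_get_body_as='POST' avoids proxies that reject GET requests with a body.
es = Elasticsearch(['https://user:[email protected]/elasticsearch'],
                   connection_class=RequestsHttpConnection,
                   send_get_body_as='POST')
print(es.info())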