
Python requests in AWS Lambda cannot connect to site behind proxy

I have an AWS Lambda function that needs to connect to an internal website which is behind a proxy. In my code I am doing the following:

from botocore.vendored import requests

https_proxy = "https://myproxy:myport"
proxyDict = {
    "https": https_proxy
}
requests.get("https://myurl.json", proxies=proxyDict)

Running this gives me the following error message:

HTTPSConnectionPool(host='myproxyhost', port=443): Max retries exceeded with url: myurl.json (Caused by ProxyError('Cannot connect to proxy.', gaierror(-2, 'Name or service not known')))

I have tried replacing the proxied URL with google.com to confirm I can connect to other sites (without the proxy).

It looks like the address space that Lambda runs in is being blocked by the proxy.

Is there something else I need to configure in requests or Lambda to get this to work?

William Ross asked Nov 14 '17 21:11


2 Answers

EDIT: After reading the question again, I realised that the error is due to name resolution (-2, 'Name or service not known'). If you are using internal Route 53 for your VPC, the solution below should still work, as the Lambda function will use the VPC's DNS servers.
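To confirm that name resolution is the failing step, a quick check can be run from inside the Lambda function. This is only a minimal sketch using the standard library: the function name `proxy_host_resolves` and the example proxy URL are hypothetical, not part of the original question.

```python
import socket
from urllib.parse import urlsplit


def proxy_host_resolves(proxy_url):
    """Return True if the hostname in proxy_url can be resolved by DNS."""
    host = urlsplit(proxy_url).hostname
    if host is None:
        # No hostname could be parsed from the URL
        return False
    try:
        # Same lookup that fails with gaierror(-2, ...) in the question
        socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical proxy URL; replace with your own proxy address
    print(proxy_host_resolves("http://myproxy.example.internal:3128"))
```

If this returns False inside Lambda but True from a machine on your internal network, the function is most likely not using the DNS servers that know about the proxy's hostname.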

It seems that either the Lambda function is not running in the same subnet as your proxy instance, or a security group is blocking the connection. To fix it:

  • Create a security group that allows the Lambda function to connect to port 443 on your proxy host
  • Update your Lambda function to use that security group and to run inside your subnet

This script should do it:

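#!/bin/bash
# Fill in the variables below with your VPC ID, subnet ID(s) and function name
VPC_ID=""
SUBNET_IDS=""
FUNCTION_NAME=""

# Create the security group and capture its ID
SEC_GROUP=$(aws ec2 create-security-group --group-name 'lambda-proxy' --vpc-id "$VPC_ID" --description 'Lambda/proxy communication' --query 'GroupId' --output text)
# Allow members of this group to reach each other on port 443
# (assumes the proxy instance is also placed in this group, see the step after the script)
aws ec2 authorize-security-group-ingress --group-id "$SEC_GROUP" --protocol tcp --port 443 --source-group "$SEC_GROUP"
# Run the function inside the subnet(s) with the new security group
aws lambda update-function-configuration --function-name "$FUNCTION_NAME" --vpc-config "SubnetIds=$SUBNET_IDS,SecurityGroupIds=$SEC_GROUP"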

Then assign the newly created security group to your proxy instance.

Hope it helps

Filipe answered Sep 28 '22 16:09


You can use Lambda environment variables: add https_proxy as an environment variable on the Lambda function, and the function will then access the website through the proxy.
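As a rough illustration of how the environment variable is picked up, the sketch below uses only the standard library. It assumes the proxy URL shown (a hypothetical placeholder) and sets the variable in code purely for demonstration; in Lambda it would come from the function configuration. requests reads proxy settings from the environment in a similar way when no `proxies` argument is passed.

```python
import os
import urllib.request

# Hypothetical proxy address, set here only for demonstration.
# In Lambda, define https_proxy in the function's environment variables instead.
os.environ["https_proxy"] = "http://myproxy.example.internal:3128"


def proxies_from_environment():
    """Return the scheme-to-proxy mapping detected from the environment."""
    return urllib.request.getproxies()


if __name__ == "__main__":
    # Shows which proxy would be used for https:// URLs
    print(proxies_from_environment().get("https"))
```

Note that the proxy hostname must still be resolvable from inside the function, so this does not by itself fix the name resolution error in the question.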

Usman Azhar answered Sep 28 '22 15:09