I have an AWS Lambda function that needs to connect to an internal website which is behind a proxy. In my code I am doing the following:
from botocore.vendored import requests
https_proxy = "https://myproxy:myport"
proxyDict = {
    "https": https_proxy
}
requests.get("https://myurl.json", proxies=proxyDict)
Running this gives me the following error message:
HTTPSConnectionPool(host='myproxyhost', port=443): Max retries exceeded with url: myurl.json (Caused by ProxyError('Cannot connect to proxy.', gaierror(-2, 'Name or service not known')))
I have tried replacing the proxied URL with google.com to confirm I can connect to other sites (without the proxy).
It looks like the address space that Lambda runs in gets blocked by the proxy.
Is there something else I need to set with requests and Lambda to get this to work?
To use a proxy in Python, first import the requests package. Next create a proxies dictionary that defines the HTTP and HTTPS connections. This variable should be a dictionary that maps a protocol to the proxy URL. Additionally, make a url variable set to the webpage you're scraping from.
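As a rough sketch of the description above (not anyone's confirmed setup): the helper names `build_proxies` and `fetch_via_proxy` are made up for illustration, and the port 3128 in the usage note is a placeholder. It also assumes the third-party requests package is installed. Note that the scheme in the proxy URL describes how to reach the proxy itself, which is often plain `http://` even when the target URL is HTTPS.

```python
def build_proxies(host, port, scheme="http"):
    """Map each protocol to the proxy URL, as requests expects.

    host, port and scheme are placeholders; use your proxy's real values.
    """
    proxy_url = f"{scheme}://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, proxies, timeout=10):
    """Fetch url through the given proxies (hypothetical helper)."""
    import requests  # third-party package, imported lazily

    return requests.get(url, proxies=proxies, timeout=timeout)
```

Usage would look something like `fetch_via_proxy("https://myurl.json", build_proxies("myproxy", 3128))`.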
To enable CORS for the Lambda proxy integration, you must add Access-Control-Allow-Origin: domain-name to the output headers. domain-name can be * for any domain name. The output body is marshalled to the frontend as the method response payload.
Short description. Internet access from a private subnet requires network address translation (NAT). To give internet access to an Amazon VPC-connected Lambda function, route its outbound traffic to a NAT gateway or NAT instance in a public subnet.
The benefits of Python in AWS Lambda environments. Python is without a doubt the absolute winner when it comes to spinning up containers. It's about 100 times faster than Java or C#. Third-party modules. Like npm, Python has a wide variety of modules available.
EDIT: After reading the question again I realised that the error is due to name resolution (-2, 'Name or service not known'). If you are using internal Route53 for your VPC, the solution below should still work, as the Lambda function will use the VPC's DNS servers.
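One way to confirm the name-resolution theory is to try resolving the proxy hostname from inside the function before making any request. This is only a minimal sketch using the standard library; `can_resolve` is a hypothetical helper, and you would pass your own proxy hostname and port.

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if hostname resolves from the current environment."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Same error family as gaierror(-2, 'Name or service not known')
        return False
```

If this returns False for the proxy host inside Lambda but True from a machine in the VPC, the function is most likely not using the VPC's DNS servers.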
It seems either the Lambda function is not running on the same subnet as your proxy instance, or the security group is blocking the connection. To fix it:
This script should do it:
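#!/bin/bash
# Fill in the variables below with your VPC ID, subnet ID(s) and function name
VPC_ID=""
SUBNET_IDS=""
FUNCTION_NAME=""
# Create a security group to be shared by the Lambda function and the proxy instance
SEC_GROUP=$(aws ec2 create-security-group --group-name 'lambda-proxy' --vpc-id "$VPC_ID" --description 'Lambda/proxy communication' --query 'GroupId' --output text)
# Allow members of the group to reach each other on TCP 443 (the source group is an assumption; change the port if your proxy listens elsewhere)
aws ec2 authorize-security-group-ingress --group-id "$SEC_GROUP" --protocol tcp --port 443 --source-group "$SEC_GROUP"
# Attach the function to the VPC subnet(s) using the new security group
aws lambda update-function-configuration --function-name "$FUNCTION_NAME" --vpc-config "SubnetIds=$SUBNET_IDS,SecurityGroupIds=$SEC_GROUP"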
Then assign the created security group to your instance.
Hope it helps
You can make use of Lambda environment variables: add https_proxy as an environment variable on the Lambda function, so that the function accesses the website via the proxy.
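For illustration, here is a small sketch of how the function code could pick the proxy up from its environment. `proxies_from_env` is a hypothetical helper, and the variable names checked (`http_proxy`, `https_proxy` and their upper-case forms) are the conventional ones. As far as I know, requests already honours these variables by default, so an explicit lookup like this is mainly useful for logging or for passing `proxies=` yourself.

```python
import os


def proxies_from_env(environ=os.environ):
    """Build a requests-style proxies dict from proxy environment variables."""
    proxies = {}
    for scheme in ("http", "https"):
        # Check the lower-case name first, then the upper-case variant
        value = environ.get(f"{scheme}_proxy") or environ.get(f"{scheme.upper()}_PROXY")
        if value:
            proxies[scheme] = value
    return proxies
```

The result can then be passed as the `proxies` argument to `requests.get`.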