I am trying to use boto3 with Python 3.6 to connect to my Redshift cluster using the get_cluster_credentials API. The following code times out every time when the Lambda function is attached to the VPC. It runs without issue when the Lambda function is not attached to the VPC.
I can't figure out if get_cluster_credentials uses the public or private IP to access Redshift. I also can't figure out if there is a way to force it to use one or the other.
import json
import boto3

def lambda_handler(event, context):
    redshiftClient = boto3.client('redshift', region_name='us-east-1')
    cluster_creds = redshiftClient.get_cluster_credentials(DbUser='awsuser',
                                                           DbName='dev',
                                                           ClusterIdentifier='redshift-cluster-1',
                                                           AutoCreate=False)
    print(cluster_creds)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
My configuration is very simple. The NACL allows all traffic (0.0.0.0/0) on all ports and protocols. My security group does the same.
I have 1 internet gateway defined: igw-0d1e6dcbfdea792b2
I have 1 subnet and 1 routing table in the VPC. The routing table has one rule to map 0.0.0.0/0 --> igw-0d1e6dcbfdea792b2.
I am able to connect from outside AWS to the cluster using SQL Workbench/J without issue.
I have looked at many posts, threads and documents, but cannot figure out what is happening:
AWS Lambda times out connecting to RedShift
Connect Lambda to Redshift in Different Availability Zones
https://github.com/awslabs/aws-lambda-redshift-loader/issues/86
Accessing Redshift from Lambda - Avoiding the 0.0.0.0/0 Security Group
https://aws.amazon.com/blogs/big-data/a-zero-administration-amazon-redshift-database-loader/
Conecting AWS Lambda to Redshift - Times out after 60 seconds
Please help.
Thanks a lot.
As per your other question, when an AWS Lambda function is added to a VPC, it does not receive a public IP address. Therefore, if the function needs to access the Internet (in this case, to make the get_cluster_credentials() call), you should:
- Attach the Lambda function to a private subnet
- Launch a NAT Gateway in a public subnet
- Configure the route table of the private subnet to send 0.0.0.0/0 traffic to the NAT Gateway
It will not work if you have only one subnet, since the Lambda function will not be able to access the NAT Gateway.
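As a rough way to check the routing requirement described above, the sketch below classifies a subnet by where its 0.0.0.0/0 route points. The helper names classify_default_route and fetch_routes are made up for illustration. The route dicts are assumed to have the shape of the "Routes" entries returned by the EC2 describe_route_tables call.

```python
def classify_default_route(routes):
    """Return where the 0.0.0.0/0 route of a route table points.

    `routes` is a list of dicts shaped like the "Routes" entries from
    EC2 describe_route_tables. Returns "nat", "igw", "other" or "none".
    """
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if (route.get("NatGatewayId") or "").startswith("nat-"):
            return "nat"   # private subnet: suitable for a VPC-attached Lambda
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "igw"   # public subnet: Lambda has no public IP, so calls time out
        return "other"
    return "none"


def fetch_routes(subnet_id, region_name="us-east-1"):
    """Hypothetical helper: load the routes explicitly associated with a subnet."""
    import boto3  # imported lazily so the classifier works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    tables = response["RouteTables"]
    # An empty result usually means the subnet relies on the VPC's main route table.
    return tables[0]["Routes"] if tables else []
```

For the setup in the question, the only subnet routes 0.0.0.0/0 to igw-0d1e6dcbfdea792b2, so the classifier would report "igw" for the subnet the Lambda function is attached to.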
I have also had success manually assigning an Elastic IP address to the Lambda function's ENI (instead of using a NAT Gateway), but this will not scale because Lambda might deploy additional containers and therefore additional ENIs. It might be sufficient if the function runs rarely and never concurrently.
You should be able to connect to Redshift directly from the VPC without an Internet Gateway or NAT Gateway. This is what AWS PrivateLink is for, and Redshift is supported.
A generic description of the process (service specific variations apply):
Now, in your code when you create the client, you need to define the region and the endpoint for the client.
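A minimal sketch of what that client setup might look like, assuming an interface endpoint for the Redshift API already exists in the VPC. The helper names and the endpoint URL are placeholders, not values from the question. The short timeouts are an added assumption so that a misconfigured network fails quickly instead of running until the Lambda timeout.

```python
def redshift_client_kwargs(region_name, endpoint_url=None):
    """Build the keyword arguments for boto3.client('redshift', ...)."""
    kwargs = {"region_name": region_name}
    if endpoint_url:
        # Only needed when the endpoint's private DNS name is not in use.
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


def make_redshift_client(region_name, endpoint_url=None):
    """Hypothetical helper: create a Redshift client that fails fast."""
    import boto3  # imported lazily so the kwargs helper works without boto3
    from botocore.config import Config

    config = Config(connect_timeout=5, read_timeout=10,
                    retries={"max_attempts": 1})
    return boto3.client("redshift", config=config,
                        **redshift_client_kwargs(region_name, endpoint_url))


# Placeholder URL: substitute the DNS name of your own interface endpoint.
EXAMPLE_ENDPOINT = "https://vpce-EXAMPLE.redshift.us-east-1.vpce.amazonaws.com"
```

Inside the handler this would be used as make_redshift_client('us-east-1', EXAMPLE_ENDPOINT), followed by the same get_cluster_credentials call as in the question.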
Disclaimer: I've not done this for Redshift, but I have done it for STS and it works.
Creating an interface endpoint docs
Docs for Redshift specifically
list of resources that support AWS PrivateLink