
AWS DAX ConnectionException when accessing with VPC peering from lambda

I have an AWS Lambda function in a VPC in AWS account A. That VPC has a peering connection with a VPC in AWS account B, which contains a DAX cluster. I get the following error when trying to connect to the DAX cluster from my Lambda function.

2021-12-17T17:29:34.096Z    279f4ed8-a6ea-4f50-b1d7-31c307cc3f30    ERROR   Failed to pull from my-cluster.v3fh7d.dax-clusters.us-east-1.amazonaws.com (11.0.225.143): TimeoutError: ConnectionException: Connection timeout after 10000ms
    at SocketTubePool.alloc (/var/task/node_modules/amazon-dax-client/src/Tube.js:244:64)
    at /var/task/node_modules/amazon-dax-client/generated-src/Operations.js:215:30 {
  time: 1639762164096,
  code: 'ConnectionException',
  retryable: true,
  requestId: null,
  statusCode: -1,
  _tubeInvalid: false,
  waitForRecoveryBeforeRetrying: false
}

The relevant part of my Lambda code is below.

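// Imports inferred from the stack trace and the AWS SDK v2 calls used below.
const AWS = require("aws-sdk");
const AmazonDaxClient = require("amazon-dax-client");

let assumedRole;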

const sts = new AWS.STS({ region: "us-east-1" });
const params = {
  RoleArn:
    "arn:aws:iam::<account-b>:role/role-to-access-dax",
  RoleSessionName: "testAssumeRoleSession" + Date.now().toString(),
  DurationSeconds: 3600,
};

try {
  assumedRole = await sts.assumeRole(params).promise();
} catch (error) {
  console.log("Failed getting sts assume role: " + error);
  // Rethrow: without credentials the DAX client below cannot be built.
  throw error;
}

const dax = new AmazonDaxClient({
  endpoint:
    "dax://my-cluster.v3fh7d.dax-clusters.us-east-1.amazonaws.com",
  region: "us-east-1",
  accessKeyId: assumedRole.Credentials.AccessKeyId,
  secretAccessKey: assumedRole.Credentials.SecretAccessKey,
  sessionToken: assumedRole.Credentials.SessionToken,
  httpOptions: { timeout: 150000 },
  maxRetries: 1,
});

const dynamodb = new AWS.DynamoDB.DocumentClient({ service: dax });

try {
  const params = {
    Key: {
      userid: requestData.userid,
    },
    TableName: "my-users-table",
  };
  const result = await dynamodb.get(params).promise();

  // Loose equality with null also matches undefined.
  if (result.Item == null) {
    return createResponse(401, "Unauthorized");
  }
  return createResponse(200, JSON.stringify(result.Item));
} catch (error) {
  return createResponse(500, error);
}

The role arn:aws:iam::<account-b>:role/role-to-access-dax has the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "dax:GetItem",
                "dax:BatchGetItem",
                "dax:Query",
                "dax:Scan",
                "dax:PutItem",
                "dax:UpdateItem",
                "dax:DeleteItem",
                "dax:BatchWriteItem",
                "dax:ConditionCheckItem"
            ],
            "Resource": "arn:aws:dax:us-east-1:<account-b>:cache/my-cluster"
        }
    ]
}

and the following trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-a>:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The DAX cluster has the policy AmazonDynamoDBFullAccess.

The peering connection shows up as Active in the AWS console.

The DAX cluster's security group has an inbound rule to allow TCP traffic on port 8111 from source <account-a> / <sg-of-lambda>.

The CIDR of the Account A VPC is 10.0.0.0/24 and the CIDR of the Account B VPC is 11.0.0.0/16.

The Account A VPC's main route table has a route directing traffic with destination 11.0.0.0/16 to the peering connection. Likewise, the Account B VPC's main route table has a route directing traffic with destination 10.0.0.0/24 to the peering connection.
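As a quick sanity check of the addressing described above, here is a minimal sketch that tests whether an address falls inside a CIDR block. The helper names `ipToInt` and `cidrContains` are hypothetical and not part of any AWS library; the only real values are the two CIDRs from the question and the DAX node IP from the error log.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Return true when `ip` lies inside the block written as "base/prefix".
function cidrContains(cidr, ip) {
  const [base, bits] = cidr.split("/");
  const prefix = Number(bits);
  // A shift by 32 is a no-op in JavaScript, so /0 is handled separately.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

const accountA = "10.0.0.0/24"; // VPC with the Lambda function
const accountB = "11.0.0.0/16"; // VPC with the DAX cluster
const daxNodeIp = "11.0.225.143"; // taken from the error message

console.log(cidrContains(accountB, daxNodeIp)); // is the DAX node inside the Account B VPC?
console.log(cidrContains(accountA, daxNodeIp)); // is it inside the Account A VPC?
```

If the first check passes and the second fails, the DAX node address is covered by the 11.0.0.0/16 peering route and does not collide with the Lambda VPC's own range, which suggests the CIDRs themselves are not the problem.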

As an aside, the following options in the Lambda code appear to be ignored: the DAX request is retried quite a few times, and the timeout stays at 10000 ms.

  httpOptions: { timeout: 150000 },
  maxRetries: 1,

asked Apr 13 '26 22:04 by harindoo


1 Answer

I was able to solve this issue with the help of an AWS rep. It turns out the VPC containing the Lambda function needed both a public and a private subnet. The Lambda function itself had to be in the private subnet, while the public subnet hosts a NAT gateway and routes to an internet gateway. Instead of a single route table for the VPC, I needed a separate route table for each subnet. The private subnet's route table contains the peering connection route and the VPC CIDR route mentioned in my question, plus a route with destination 0.0.0.0/0 and the NAT gateway as the target. The public subnet's route table contains the VPC CIDR route and a route with destination 0.0.0.0/0 and the internet gateway as the target.
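To make the two route tables easier to picture, here is a rough sketch that models them as plain data and looks up the target for a destination using longest-prefix matching, which is how VPC route tables choose among overlapping routes. The target labels (`local`, `peering-connection`, `nat-gateway`, `internet-gateway`) and the function names are placeholders rather than real resource IDs or AWS APIs, and 203.0.113.10 is just a stand-in for some public address.

```javascript
// Route tables as described in the answer; targets are placeholder labels.
const privateRoutes = [
  { destination: "10.0.0.0/24", target: "local" },
  { destination: "11.0.0.0/16", target: "peering-connection" },
  { destination: "0.0.0.0/0", target: "nat-gateway" },
];
const publicRoutes = [
  { destination: "10.0.0.0/24", target: "local" },
  { destination: "0.0.0.0/0", target: "internet-gateway" },
];

// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Return true when `ip` lies inside the block written as "base/prefix".
function matches(cidr, ip) {
  const [base, bits] = cidr.split("/");
  const prefix = Number(bits);
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Pick the most specific (longest-prefix) route that covers the address.
function resolveTarget(routes, ip) {
  let best = null;
  for (const route of routes) {
    const prefix = Number(route.destination.split("/")[1]);
    if (matches(route.destination, ip) && (best === null || prefix > best.prefix)) {
      best = { prefix, target: route.target };
    }
  }
  return best ? best.target : null;
}

console.log(resolveTarget(privateRoutes, "11.0.225.143")); // DAX node IP from the error log
console.log(resolveTarget(privateRoutes, "203.0.113.10")); // stand-in public address
console.log(resolveTarget(publicRoutes, "203.0.113.10"));
```

Under these assumptions, traffic from the private subnet to the DAX node resolves to the peering connection, while anything outside both VPC ranges falls through to the NAT gateway and, from the public subnet, on to the internet gateway. The answer does not say why the default route was required, so the sketch only shows where each destination would be sent.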

answered Apr 16 '26 15:04 by harindoo