I have AWS API gateway setup for a public endpoint with no auth. It connects to a websocket that triggers a Lambda.
I was creating connections with Python's websocket-client
lib at https://pypi.org/project/websocket_client/.
I noticed that connections would fail ~10% of the time, and get worse as I increased load. I can't find anywhere that would be throttling me seeing as my general API Gateway settings say Your current account level throttling rate is 10000 requests per second with a burst of 5000 requests.
. That’s beside the point that just 2-3 requests per second would trigger issue fairly often.
Meanwhile the failure response would be like {u'message': u'Forbidden', u'connectionId': u'Z2Jp-dR5vHcCJkg=', u'requestId': u'Z2JqAEJRvHcFzvg='}
I went into my CloudWatch log insights and searched for the connection ID and request ID. The log group for the API gateway would find no results with either ID. Yet a search on my Lambda that fires on websocket connect, would have a log with that connection ID. The log showed everything running as expected on our side. The lambda simply runs a MySQL query that fires.
Why would I get a response of forbidden, despite the lambda working as expected?
The existing question over at getting message: forbidden reply from AWS API gateway, seems to address if it's ALWAYS returning forbidden for some private endpoints. Nothing lined up with my use case.
UPDATE
I think this may be related to locust.io
, or python, which I'm using to connect every second. I installed https://www.npmjs.com/package/wscat on my machine and am connecting and closing as fast as possible repeatedly. I am not getting a Forbidden
message. It's just extra confusing since I'm not sure how the way I connect would randomly spit back a Forbidden
message some of the time.
class SocketClient(object):
def __init__(self, host):
self.host = host
self.session_id = uuid4().hex
def connect(self):
self.ws = websocket.WebSocket()
self.ws.settimeout(10)
self.ws.connect(self.host)
events.quitting += self.on_close
data = self.attach_session({})
return data
def attach_session(self, payload):
message_id = uuid4().hex
start_time = time.time()
e = None
try:
print("Sending payload {}".format(payload))
data = self.send_with_response(payload)
assert data['mykey']
except AssertionError as exp:
e = exp
except Exception as exp:
e = exp
self.ws.close()
self.connect()
elapsed = int((time.time() - start_time) * 1000)
if e:
events.request_failure.fire(request_type='sockjs', name='send',
response_time=elapsed, exception=e)
else:
events.request_success.fire(request_type='sockjs', name='send',
response_time=elapsed,
response_length=0)
return data
def send_with_response(self, payload):
json_data = json.dumps(payload)
g = gevent.spawn(self.ws.send, json_data)
g.get(block=True, timeout=2)
g = gevent.spawn(self.ws.recv)
result = g.get(block=True, timeout=10)
json_data = json.loads(result)
return json_data
def on_close(self):
self.ws.close()
class ActionsTaskSet(TaskSet):
@task
def streams(self):
response = self.client.connect()
logger.info("Connect Response: {}".format(response))
class WSUser(Locust):
task_set = ActionsTaskSet
min_wait = 1000
max_wait = 3000
def __init__(self, *args, **kwargs):
super(WSUser, self).__init__(*args, **kwargs)
self.client = SocketClient('wss://mydomain.amazonaws.com/endpoint')
Update 2
I have enabled access logs, the one type of log that wasn't there before. I can now see that my lambdas are always getting a 200 with no issue. The 403 is coming from some MESSAGE
eventType
that doesn't hit an actual routeKey
. Not sure where it comes from, but pretty sure finding that answer will solve this.
I was also able to confirm there are no ENI issues.
An HTTP 403 response code means that a client is forbidden from accessing a valid URL. The server understands the request, but it can't fulfill the request because of client-side issues. The caller isn't authorized to access an API that's using an API Gateway Lambda authorizer.
The HTTP 403 Forbidden error most commonly occurs when private DNS is enabled for an API Gateway interface VPC endpoint that's associated with a VPC. In this scenario, all requests from the VPC to API Gateway APIs resolve to that interface VPC endpoint.
It means the wrong username and password were sent in the request, or that the permissions on the server do not allow what was being asked.
The 403 Forbidden error appears when your server denies you permission to access a page on your site. This is mainly caused by a faulty security plugin, a corrupt . htaccess file, or incorrect file permissions on your server.
You might be running into some VPC-related limits. See https://winterwindsoftware.com/scaling-lambdas-inside-vpc/. Sounds like you might be running out of ENIs. You could try moving the function to a different VPC. How long does each invocation of the lambda run for? And what language is you lambda written in?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With