I'm trying to create a Glue Job that enumerates all tables in a database in my catalog. In order to do so I use the following code snippet:
import boto3

session = boto3.Session(region_name='us-east-2')
glue = session.client('glue')
# List the tables in the 'customer1' database of the Data Catalog
tables = glue.get_tables(
    DatabaseName='customer1'
)
print(tables)
The job hangs for about 15 minutes and the connection appears to be refused, because I eventually get the following error:
botocore.vendored.requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='glue.us-east-2.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to glue.us-east-2.amazonaws.com timed out. (connect timeout=60)'))
This issue is specific to the glue API. I can use the S3 API with no problems.
I've gone through all my security groups and opened up all the ports to traffic from anywhere. I've even added self-referencing rules. But this is to no avail.
I can't figure out what could be causing the connection to be blocked. Is AWS specifically blocking glue requests?
I was facing the same problem: boto3 calls to glue or s3 were hanging and eventually timing out.
I fixed it by changing the subnet-id when creating the dev-endpoint. Initially I was using a subnet that routed traffic to an Internet Gateway. I switched to a subnet routing traffic to an internal NAT gateway. Hope this helps.
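To check which kind of subnet you are using, you can look at where its route table sends 0.0.0.0/0 traffic. Below is a minimal sketch of that check, not a definitive diagnosis. The helper names default_route_target and route_tables_for_subnet are my own, and the sample dictionaries are hand-written to mimic the shape of the EC2 describe_route_tables response. A likely explanation for the fix is that the Glue network interfaces only get private IP addresses, so a route through an Internet Gateway does not give them a path to the public Glue endpoint, while a NAT gateway does.

```python
def default_route_target(route_table):
    """Classify where a route table sends internet-bound (0.0.0.0/0) traffic.

    `route_table` is one entry of the "RouteTables" list returned by
    EC2 describe_route_tables. Hypothetical helper, for illustration only.
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("NatGatewayId"):
            return "nat-gateway"
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "internet-gateway"
        return "other"
    return "none"


def route_tables_for_subnet(subnet_id, region_name="us-east-2"):
    """Fetch route tables explicitly associated with a subnet.

    Needs boto3 and AWS credentials. A subnet without an explicit
    association uses the VPC's main route table, which this filter
    does not return.
    """
    import boto3  # imported lazily so the classifier above runs without it

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    return response["RouteTables"]


# Hand-written sample data (made-up IDs) shaped like the API response.
public_table = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"},
]}
private_table = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0def"},
]}

print(default_route_target(public_table))
print(default_route_target(private_table))
```

If the subnet you gave Glue comes back as "internet-gateway", that matches the setup that was hanging for me; "nat-gateway" matches the one that worked.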