After reading the Amazon docs, my understanding is that the only way to run/test a Glue script is to deploy it to a dev endpoint and debug remotely if necessary. At the same time, if the (Python) code consists of multiple files and packages, everything except the main script needs to be zipped. All this gives me the feeling that Glue is not suitable for any complex ETL task, as development and testing are cumbersome. With plain Spark, I could test my code locally without having to upload it to S3 every time, and verify the tests on a CI server without having to pay for a development Glue endpoint.
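For reference, the zipping step described above can at least be scripted. This is only a minimal sketch using the Python standard library. The function name `zip_packages`, the default `main.py` script name, and the directory layout are assumptions for illustration, not anything Glue prescribes.

```python
import zipfile
from pathlib import Path


def zip_packages(src_root, zip_path, main_script="main.py"):
    """Zip every .py file under src_root except the main script.

    Paths inside the archive are kept relative to src_root so that
    packages keep their directory structure. All names here are
    hypothetical; adjust them to the actual project layout.
    """
    src_root = Path(src_root)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src_root.rglob("*.py")):
            relative = path.relative_to(src_root).as_posix()
            if relative == main_script:
                continue  # the main script is uploaded on its own
            archive.write(path, relative)
            added.append(relative)
    return added
```

Calling `zip_packages("src", "deps.zip")` would return the list of archived module paths; uploading the archive and the main script to S3 is still a separate step.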
To test an AWS Glue connection: Sign in to the AWS Management Console and open the AWS Glue console at https://console.aws.amazon.com/glue/. In the navigation pane, under Data catalog, choose Connections. Select the check box next to the desired connection, and then choose Test connection.
As of Aug 28, 2019, Amazon finally allows you to download the binaries and develop, compile, debug, and single-step Glue ETL scripts and complex Spark applications in Scala and Python locally.
Check out this link: https://aws.amazon.com/about-aws/whats-new/2019/08/aws-glue-releases-binaries-of-glue-etl-libraries-for-glue-jobs/
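One way to make use of local development is to keep the transformation logic in plain Python functions that never touch Glue or Spark objects, so they can be unit-tested on a laptop or a CI server. A minimal sketch, assuming a per-record transform: `normalize_record` and its fields are hypothetical, and how the function gets wired into the actual Glue job is left out.

```python
def normalize_record(record):
    """Hypothetical per-record transform: tidy the name, flag adults."""
    cleaned = dict(record)  # copy so the input record is not mutated
    cleaned["name"] = cleaned.get("name", "").strip().title()
    cleaned["is_adult"] = cleaned.get("age", 0) >= 18
    return cleaned


def test_normalize_record():
    result = normalize_record({"name": "  ada lovelace ", "age": 36})
    assert result == {"name": "Ada Lovelace", "age": 36, "is_adult": True}


if __name__ == "__main__":
    test_normalize_record()
    print("ok")
```

Because the function only deals with dictionaries, a test runner such as pytest can exercise it without any endpoint running; only the thin layer that reads and writes data needs the Glue libraries.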
I spoke to an AWS sales engineer and they said no, you can only test Glue code by running a Glue transform (in the cloud). He mentioned that they were testing out something called Outpost to allow on-prem operations, but that it wasn't publicly available yet. So this seems like a solid "no", which is a shame because it otherwise seems pretty nice. But without unit tests, it's a no-go for me.