
Can I test AWS Glue code locally?

After reading the Amazon docs, my understanding is that the only way to run or test a Glue script is to deploy it to a dev endpoint and debug remotely if necessary. At the same time, if the (Python) code consists of multiple files and packages, everything except the main script needs to be zipped. All this gives me the feeling that Glue is not suitable for any complex ETL task, as development and testing are cumbersome. With plain Spark, I could test my code locally without having to upload it to S3 every time, and verify the tests on a CI server without having to pay for a development Glue endpoint.

lfk asked Jan 18 '18 05:01




2 Answers

Finally, as of Aug 28, 2019, Amazon allows you to download the Glue ETL library binaries and develop, compile, debug, and single-step Glue ETL scripts and complex Spark applications in Scala and Python locally.

Check out this link: https://aws.amazon.com/about-aws/whats-new/2019/08/aws-glue-releases-binaries-of-glue-etl-libraries-for-glue-jobs/
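As a rough illustration of how a script might be laid out so that most of it can be unit-tested on a laptop or CI server, here is a minimal sketch. It assumes the per-record logic can live in a plain function, and it defers the Glue imports into `main` so that the module imports cleanly where `awsglue` is not installed. The names `normalize_record`, `user_id`, `email`, and the `database`, `table`, and `output_path` job arguments are hypothetical and not taken from the question.

```python
import sys


def normalize_record(record):
    """Per-record transform with no Spark or Glue dependency.

    Works on any dict-like record, so it can be tested with plain dicts.
    The field names here are hypothetical.
    """
    record["user_id"] = int(record["user_id"])
    record["email"] = record["email"].strip().lower()
    return record


def test_normalize_record():
    """pytest-style unit test; runs without Glue, Spark, or S3."""
    raw = {"user_id": "42", "email": "  Alice@Example.COM "}
    assert normalize_record(raw) == {"user_id": 42, "email": "alice@example.com"}


def main(argv):
    """Glue entry point (sketch). Only this function needs the Glue libraries."""
    # Deferred imports: these resolve inside a Glue job or with the
    # locally installed Glue ETL libraries, not in a bare Python environment.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import Map
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # Job argument names other than JOB_NAME are assumptions for this sketch.
    args = getResolvedOptions(argv, ["JOB_NAME", "database", "table", "output_path"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    frame = glue_context.create_dynamic_frame.from_catalog(
        database=args["database"], table_name=args["table"]
    )
    # Apply the plain function to every record of the DynamicFrame.
    cleaned = Map.apply(frame=frame, f=normalize_record)
    glue_context.write_dynamic_frame.from_options(
        frame=cleaned,
        connection_type="s3",
        connection_options={"path": args["output_path"]},
        format="parquet",
    )
    job.commit()

# In the deployed script, the last line would call: main(sys.argv)
```

Running `pytest` on this file exercises only `normalize_record`; the Glue-specific wiring in `main` would still need a run against the local Glue libraries or a real job to be verified.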

Brian answered Oct 02 '22 16:10


I spoke to an AWS sales engineer, and they said no: you can only test Glue code by running a Glue transform (in the cloud). They mentioned that they were testing out something called Outpost to allow on-prem operations, but that it wasn't publicly available yet. So this seems like a solid "no", which is a shame because Glue otherwise seems pretty nice. But without unit tests, it's a no-go for me.

nont answered Oct 02 '22 15:10