This is my requirement: I have a crawler and a pyspark job in AWS Glue. I have to setup the workflow using step function.
Questions:
References:
A few months late to answer this but this can be done from within the step function. You can create the following states to achieve it:
TriggerCrawler
: Task State: Triggers a Lambda function, within this lambda function you can write code for triggering AWS Glue Crawler using any of the aws-sdkPollCrawlerStatus
: Task state: Lambda function that polls for Crawler status and returns it as a response of lambda.IsCrawlerRunSuccessful
: Choice State: Based on that status of Glue crawler you can make Next state to be a Choice state which will either go to the next state that triggers yours Glue job (once the Glue crawler state is 'READY') or go to the Wait State
for few seconds before you poll for it again.RunGlueJob
: Task State: A Lambda function that triggers the glue job.WaitForCrawler
: Wait State: That waits for 'n' seconds before you poll for status again.Finish
: Succeed State.Here is how this Step Function will look like:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With