I have a use case where an AWS Step Function is triggered when a file is uploaded to S3. From there, the first step runs ffprobe to get the duration of the file via an external service such as transloadit, and the output is written back to S3.
I could create a new step function from that event, but I was wondering if it is possible to have something like an awaited promise inside the original step function and then continue to the next step, taking into account that it could take a while for the ffprobe result to come back.
Any advice on how to tackle this is much appreciated.
Waiting with Step Functions. One of the understated superpowers of Step Functions is the wait state. It allows you to pause a workflow for up to an entire year!
Execution history is retained for 90 days. After this time, you can no longer retrieve or view the execution history. There is no further quota on the number of closed executions that Step Functions retains. To see execution history, Amazon CloudWatch Logs logging must be configured.
Without an explicit timeout, Step Functions often relies solely on a response from an activity worker to know that a task is complete. If something goes wrong and the TimeoutSeconds field isn't specified for an Activity or Task state, an execution is stuck waiting for a response that will never come.
AWS Step Functions now supports asynchronous callbacks for long-running steps as a first-class feature.
This is similar to @mixja's answer above but simplified. A single state in your workflow can directly invoke Lambda, SNS, SQS, or ECS and wait for a call to SendTaskSuccess.
There is a good example documented for SQS, where a step function sends a message and pauses workflow execution until something provides a callback. Lambda would be equivalent (assuming the main processing, like the transloadit job, happens outside the Lambda itself).
Your step function definition would look like:

"Invoke transloadit": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
  "Parameters": {
    "FunctionName": "InvokeTransloadit",
    "Payload": {
      "some_other_param": "...",
      "token.$": "$$.Task.Token"
    }
  },
  "Next": "NEXT_STATE"
}
Then in your Lambda you would do something like:

def lambda_handler(event, context):
    token = event['token']
    # kick off the long-running transloadit job from here (e.g. via SSM or ECS),
    # passing the task token along so the worker can call back when it finishes
Then in your main long-running process you would issue a callback with the token, for example

aws stepfunctions send-task-success --task-token "$token" --task-output '{}'

from a shell script / the CLI (note that --task-output is required, even if it is just an empty JSON object), or the equivalent API call.
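If the long-running process is Python rather than a shell script, the equivalent callback via boto3 would look roughly like this (the output payload shown is just an illustrative placeholder, not something the pattern requires):

import json
import boto3

sfn = boto3.client('stepfunctions')

def report_success(token, duration_seconds):
    # resume the paused execution; the output becomes the result of the
    # "Invoke transloadit" task state
    sfn.send_task_success(
        taskToken=token,
        output=json.dumps({'durationSeconds': duration_seconds})
    )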
When you send the request to transloadit, save the taskToken for the step in S3 at a predictable key based on the uploaded file key. For example, if the media file is at 's3://my-media-bucket/foobar/media-001.mp3', you could make a JSON file containing the task token of the current step and store it under the same key in a different bucket, for example 's3://ffprobe-tasks/foobar/media-001.mp3.json'. At the end of the step that sends the media to transloadit, do not call success or failure on the step -- leave it running.
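A minimal sketch of that token-stashing step in Python with boto3 (the 'ffprobe-tasks' bucket name and the JSON layout simply follow the example above and are assumptions, not fixed names):

import json
import boto3

s3 = boto3.client('s3')

def stash_task_token(media_key, task_token):
    # media_key is the uploaded file's key, e.g. 'foobar/media-001.mp3';
    # store the token under the same key (plus .json) in a separate bucket
    s3.put_object(
        Bucket='ffprobe-tasks',
        Key=media_key + '.json',
        Body=json.dumps({'taskToken': task_token}),
        ContentType='application/json'
    )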
Then, when you get an S3 notification that the transloadit result is ready, you can determine the S3 key of the stored task token ('s3://ffprobe-tasks/foobar/media-001.mp3.json'), load the JSON (and delete it from S3), and send success for that task. The step function will then continue to the next state in the execution.
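Put together, the notification handler could look roughly like this. It assumes transloadit writes its result back under the same key as the original upload (as in the example above), so the result key maps straight to the stashed token; the bucket name and JSON field match the hypothetical layout used earlier:

import json
import boto3

s3 = boto3.client('s3')
sfn = boto3.client('stepfunctions')

TOKEN_BUCKET = 'ffprobe-tasks'  # assumed bucket holding the stashed task tokens

def lambda_handler(event, context):
    for record in event['Records']:
        # the transloadit result object that just landed; this sketch assumes it
        # reuses the key of the original upload, e.g. 'foobar/media-001.mp3'
        result_bucket = record['s3']['bucket']['name']
        result_key = record['s3']['object']['key']

        # load the matching task token and delete it so it is only used once
        token_key = result_key + '.json'
        obj = s3.get_object(Bucket=TOKEN_BUCKET, Key=token_key)
        token = json.loads(obj['Body'].read())['taskToken']
        s3.delete_object(Bucket=TOKEN_BUCKET, Key=token_key)

        # resume the paused state machine, handing it the location of the result
        sfn.send_task_success(
            taskToken=token,
            output=json.dumps({'resultBucket': result_bucket, 'resultKey': result_key})
        )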