Can an AWS Step Functions state machine execute more than 25,000 times?

I am currently evaluating an AWS Step Functions state machine that processes a single document. Each execution would take 5-10 minutes per document.

{
  "Comment":"Process document",
  "StartAt": "InitialState",
  "States": {
          //the document goes through multiple states here
  }
}

The C# code invokes the state machine, passing some JSON as input for each document. Something like:

      // max 100 documents
      public async Task Process(IEnumerable<Document> documents) // async is required because the body uses await
      {   
          var amazonStepFunctionsConfig = new AmazonStepFunctionsConfig { RegionEndpoint = RegionEndpoint.USWest2 };
          using (var amazonStepFunctionsClient = new AmazonStepFunctionsClient(awsAccessKeyId, awsSecretAccessKey, amazonStepFunctionsConfig))
          {
            foreach(var document in documents)
            {
                var jsonData1 = JsonConvert.SerializeObject(document);
                var startExecutionRequest = new StartExecutionRequest
                {
                  Input = jsonData1,
                  Name = document.Id, 
                  StateMachineArn = "arn:aws:states:us-west-2:<SomeNumber>:stateMachine:ProcessDocument"
                };
                var taskStartExecutionResponse = await amazonStepFunctionsClient.StartExecutionAsync(startExecutionRequest);                
            }
          }
      }

We process the documents in batches of 100, so the loop above handles at most 100 documents per call. However, we process thousands of documents weekly (25,000+).

As per the AWS documentation, the maximum execution history size is 25,000 events. If the execution history reaches this limit, the execution will fail.

Does that mean we cannot execute a single state machine more than 25,000 times? Why should the execution of a state machine depend on its history? Why can't AWS just purge the history?

I know there is a way to continue as a new execution, but I am just trying to understand the history limit and how it relates to state machine executions. Is my understanding correct?

Update 1
I don't think this is a duplicate question. I am trying to find out whether my understanding of the history limit is correct. Why does history have anything to do with the number of times a state machine can execute? When a state machine executes, it creates a history record; if the history grows beyond 25,000 records, AWS could purge or archive them. Why would AWS stop the execution of the state machine instead? That does not make sense.

So the question is: can a single state machine (one unique ARN) be executed more than 25,000 times in a loop? If I have to create a new state machine after 25,000 executions, wouldn't that state machine have a different ARN?

Also, if I had to follow the linked SO post, where would I get the current number of executions? And that poster is looping within the step function, while I am calling the step function from within a loop.

Update 2
So, just for testing, I created the following state machine

{
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Pass",
      "Result": "Hello World!",
      "End": true
    }
  }
}

and executed it 26,000 times with no failures:

    public static async Task Main(string[] args)
    {
        AmazonStepFunctionsClient client = new AmazonStepFunctionsClient("my key", "my secret key", Amazon.RegionEndpoint.USWest2);
        for (int i = 1; i <= 26000; i++)
        {
            var startExecutionRequest = new StartExecutionRequest
            {
                Input = JsonConvert.SerializeObject(new { }),
                Name = i.ToString(),
                StateMachineArn = "arn:aws:states:us-west-2:xxxxx:stateMachine:MySimpleStateMachine"
            };

            var response = await client.StartExecutionAsync(startExecutionRequest);
        }

        Console.WriteLine("Press any key to continue");
        Console.ReadKey();
    }

and in the AWS Console I am able to pull up the history for all 26,000 executions.

So I am not sure what exactly "Maximum execution history size is 25,000 events" means.

LP13 asked Jan 18 '19 18:01


2 Answers

I don't think you've got it right. The 25,000 limit applies to the event history of a single state machine execution. What you tested was 26,000 separate executions of the state machine. The limit on executions is 1,000,000 open executions.

A single execution can run for up to 1 year, and during that time its execution history must not exceed 25,000 events.

Hope it helps.

A.Khan answered Nov 12 '22 23:11


The term "Execution History" is used to describe 2 completely different things in the quota docs, which has caused your confusion (and mine until I realized this):

  • 90 day quota on execution history retention: This is the history of all executions, as you'd expect
  • 25,000 quota on execution history size: This is the history of "state events" within 1 execution, NOT across all executions in history. In other words, if your single execution runs through thousands of steps, thereby racking up 25k events (likely because of a looping structure in the workflow), it will suddenly fail and exit.

As long as your executions complete in under 25k steps each, you can run the state machine much more than 25k times sequentially without issue :)
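To make the "looping structure" case concrete, here is a minimal sketch of a state machine that could hit the limit inside one execution. The state names and the `$.hasMore` input field are hypothetical, made up for illustration. I'm assuming each state records an entered and an exited event, so every pass through the loop would add roughly four events to the history of that single execution:

```json
{
  "Comment": "Hypothetical loop: every pass adds events to the history of this ONE execution",
  "StartAt": "ProcessItem",
  "States": {
    "ProcessItem": {
      "Type": "Pass",
      "Next": "MoreItems"
    },
    "MoreItems": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.hasMore",
          "BooleanEquals": true,
          "Next": "ProcessItem"
        }
      ],
      "Default": "Done"
    },
    "Done": {
      "Type": "Succeed"
    }
  }
}
```

Started with an input like `{"hasMore": true}`, this never leaves the loop, so under the assumption above a single execution would be expected to fail somewhere around 6,000 iterations. The `HelloWorld` machine from the question, by contrast, only records a handful of events per execution, which is why running it 26,000 times was fine.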

lance.dolan answered Nov 13 '22 00:11