Scheduling long-running tasks using AWS services

My application relies heavily on AWS services, and I am looking for an optimal solution based on them. The web application triggers a scheduled job (assume it repeats indefinitely) that requires a certain amount of resources to run. A single run of the task normally takes at most 1 minute.

The current idea is to pass jobs via SQS and spawn workers on EC2 instances depending on the queue size (this part is more or less clear). But I am struggling to find a proper solution for actually triggering the jobs at certain intervals. Assume we are dealing with 10,000 jobs. Having a scheduler run 10k cron jobs at the same time seems like a crazy idea, even though each job is quite simple (it just passes a job description via SQS).

So the actual question is: how do I autoscale the scheduler itself, given scenarios where the scheduler is restarted, a new instance is created, and so on? Or is the scheduler redundant as an app, and is it wiser to rely on AWS Lambda functions (or other services that provide scheduling)? The problem with Lambda functions is their limits, and the 128 MB of memory allocated to a single function is actually too much (20 MB seems like more than enough).

Alternatively, the worker itself can wait for a certain amount of time and then notify the scheduler that it should trigger the job once more. Say the frequency is 1 hour:

1. The scheduler sends the job to worker 1.
2. Worker 1 performs the job and, after one hour, sends it back to the scheduler.
3. The scheduler sends the job again.

The issue here, however, is that the worker might get scaled in.

Bottom line: I am trying to achieve a lightweight scheduler that does not require autoscaling and serves as a hub whose sole purpose is transmitting job descriptions. It certainly should not get throttled on service restart.

Yerken asked Dec 10 '15 09:12


1 Answer

Lambda is perfect for this. You have a lot of short-running processes (~1 minute), and Lambda is meant for short processes (up to five minutes nowadays). It is very important to know that CPU speed scales linearly with the configured RAM. A 1 GB Lambda function is equivalent to a t2.micro instance, if I recall correctly, and 1.5 GB of RAM means 1.5x the CPU speed. The cost of these functions is so low that you can just run them at that size. A 128 MB function has 1/8 the CPU speed of a micro instance, so I do not recommend using those.

As a queueing mechanism you can use S3 (yes, you read that right). Create a bucket and let the Lambda worker trigger when an object is created. When you want to schedule a job, put a file in the bucket. Lambda starts and processes it immediately.
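A minimal sketch of that flow in Python, assuming boto3 is available and the bucket is configured to invoke the function on object creation. The `jobs/` key prefix, the JSON job format, and the `run_job` function are hypothetical placeholders, not part of the original setup. The S3 client is injectable so the logic can be exercised without AWS.

```python
import json
from urllib.parse import unquote_plus


def schedule_job(s3, bucket, job_id, description):
    """Scheduling side: write the job description as an object in the bucket."""
    key = f"jobs/{job_id}.json"  # hypothetical key layout
    s3.put_object(Bucket=bucket, Key=key,
                  Body=json.dumps(description).encode("utf-8"))
    return key


def extract_jobs(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    jobs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(s3_info["object"]["key"])
        jobs.append((s3_info["bucket"]["name"], key))
    return jobs


def run_job(description):
    """Hypothetical placeholder for the actual ~1 minute of work."""
    return {"status": "done", "job": description}


def handler(event, context, s3=None):
    """Worker side: Lambda entry point for object-created events."""
    if s3 is None:
        import boto3  # imported lazily so the module loads without boto3
        s3 = boto3.client("s3")
    results = []
    for bucket, key in extract_jobs(event):
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        results.append(run_job(json.loads(body)))
        # Remove the job file once the work is done.
        s3.delete_object(Bucket=bucket, Key=key)
    return results
```

Deleting the object only after `run_job` returns means a failed run leaves the job file in place, at the cost of needing some other mechanism to retry it.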

Now you have to respect some limits. With this setup you can have only 100 workers running at the same time (the total number of active Lambda instances), but you can ask AWS to increase this limit.
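To see why that limit matters, here is a rough back-of-the-envelope calculation using the question's numbers (10,000 jobs, up to 1 minute each, an hourly interval as the example frequency). It assumes the worst case where every job fires at once and each takes the full minute; the function names are made up for illustration.

```python
import math


def drain_time_seconds(jobs, concurrency, seconds_per_job):
    """Time to work through all jobs when they run in waves of `concurrency`."""
    waves = math.ceil(jobs / concurrency)
    return waves * seconds_per_job


def min_concurrency(jobs, seconds_per_job, interval_seconds):
    """Smallest concurrency whose total capacity covers all jobs in one interval."""
    return math.ceil(jobs * seconds_per_job / interval_seconds)


# 10,000 jobs at 60 s each with 100 concurrent workers.
print(drain_time_seconds(10_000, 100, 60) / 60, "minutes")
# Concurrency needed to finish the same batch within one hour.
print(min_concurrency(10_000, 60, 3600), "workers")
```

Under these assumptions the default limit takes 100 minutes to drain a batch, which is longer than an hourly interval, so either the limit has to be raised or the jobs have to be spread across the hour.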

The costs are as follows:

  • $0.005 per 1000 PUT requests, so $5 per million job requests (this is more expensive than SQS).
  • The Lambda runtime. Assuming normal t2.micro CPU speed (1 GB RAM), this costs $0.0001 per job (60 seconds; the first 300,000 seconds are free, which covers 5000 jobs).
  • The Lambda requests: $0.20 per million triggers (the first million is free).
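As a rough illustration, the three items can be combined into a monthly estimate. The sketch below takes the prices and free allowances exactly as quoted in this answer, so treat them as assumptions to check against current AWS pricing. The hourly schedule over a 30-day month is only an example workload.

```python
# Figures as quoted in the answer above (assumptions, not current AWS pricing).
PUT_COST_PER_1000 = 0.005         # S3 PUT requests
RUNTIME_COST_PER_JOB = 0.0001     # one 60-second run at 1 GB
FREE_JOBS = 5000                  # runs covered by the free runtime allowance
REQUEST_COST_PER_MILLION = 0.20   # Lambda triggers
FREE_REQUESTS = 1_000_000


def monthly_cost(jobs):
    """Estimate the monthly cost in dollars for a given number of job runs."""
    put = jobs / 1000 * PUT_COST_PER_1000
    runtime = max(0, jobs - FREE_JOBS) * RUNTIME_COST_PER_JOB
    requests = max(0, jobs - FREE_REQUESTS) / 1_000_000 * REQUEST_COST_PER_MILLION
    return {"put": put, "runtime": runtime, "requests": requests,
            "total": put + runtime + requests}


# Example workload: 10,000 jobs, each run hourly for 30 days.
runs = 10_000 * 24 * 30
for item, dollars in monthly_cost(runs).items():
    print(f"{item}: ${dollars:.2f}")
```

With these figures the runtime term dominates, so the memory size chosen for the function is the main lever on cost.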

This setup does not require any servers on your part. It goes down only if AWS itself does.

(Don't forget to delete the job file from S3 when you're done.)

Luc Hendriks answered Nov 15 '22 03:11