
Google App Engine - Task Queues vs Cron Jobs

The latest Google App Engine release supports a new Task Queue API in Python. I was comparing the capabilities of this API with the existing Cron service. For background jobs that are not user-initiated, such as fetching an RSS feed and parsing it once a day, can and should the Task Queue API be used?

mshafrir asked Jun 22 '09 15:06




1 Answer

I'd say "sort of". The things to remember about task queues are:

1) a limit on operations per minute/hour/day is not the same as repeating something at regular intervals. Even with the token bucket size set to 1, I don't think you're guaranteed that the repetitions will be evenly spaced. It depends on how literally they mean it when they say the queue is implemented as a token bucket, and on whether that statement is meant to be a guaranteed part of the interface. Since the API is still in Labs, nothing is guaranteed yet.

2) if a task fails, it's requeued and retried. If a cron job fails, the failure is logged and the job isn't retried until it's due again. So a cron job doesn't behave like either of the obvious task-based imitations: a task which adds a copy of itself and then refreshes your feed, or a task which refreshes your feed and then adds a copy of itself.
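To make point 2 concrete, here is a small pure-Python simulation of the two task orderings. It does not use the App Engine API at all: `make_refresh`, `run_as_task` and `simulate` are hypothetical names invented for this sketch, and the retry loop is only an assumption about how a requeued task behaves, not a description of the real queue.

```python
def make_refresh(failures):
    """Return a stand-in feed refresh that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def refresh():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("feed fetch failed")

    return refresh


def run_as_task(task, max_attempts=5):
    """Mimic a queue that re-runs a failed task; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            task()
            return attempt
        except RuntimeError:
            pass  # the failed task is requeued and tried again
    return None  # gave up (only to keep the simulation finite)


def simulate(order, failures=1):
    """Run one self-rescheduling task; return (attempts, follow-up tasks queued)."""
    refresh = make_refresh(failures)
    scheduled = []

    def enqueue_next():
        scheduled.append("next run")

    if order == "enqueue_first":
        def task():
            enqueue_next()  # runs again on every retry
            refresh()
    else:
        def task():
            refresh()
            enqueue_next()  # only reached once the refresh succeeds

    return run_as_task(task), len(scheduled)


if __name__ == "__main__":
    print(simulate("enqueue_first"))
    print(simulate("refresh_first"))
```

With one failed fetch, the "enqueue first" ordering queues two follow-up tasks (one per attempt), so the chain forks, while the "refresh first" ordering queues exactly one but retries straight away instead of waiting for the next interval. A cron job would do neither: it would log the failure and simply run again at its next scheduled time.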

It may well be possible to mock up cron jobs using tasks, but I doubt it's worth it. If you're trying to work around a cron job which takes more than 30 seconds to run (or hits any other request limit), you can split the work into pieces and have a cron job which adds all the pieces to a task queue. There was some talk (in the GAE blog?) about asynchronous urlfetch, which might ultimately be the best way of updating RSS feeds.
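A minimal sketch of that "cron job adds the pieces to a task queue" pattern, in plain Python so it runs anywhere. The `enqueue` callable, the `/tasks/refresh_feeds` path and the batch size are all assumptions made for illustration; in a real app `enqueue` would presumably wrap the Task Queue API's `taskqueue.add(url=..., params=...)` and take care of encoding the payload.

```python
def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def cron_fan_out(feed_urls, enqueue, batch_size=10):
    """Body of the (hypothetical) cron handler: queue one task per batch.

    `enqueue(url, payload)` is an injected stand-in for the real queue call.
    Returns the number of tasks queued.
    """
    batches = chunk(feed_urls, batch_size)
    for batch in batches:
        # Each task only has to refresh a small batch, so it stays well
        # inside the per-request limit.
        enqueue("/tasks/refresh_feeds", {"urls": batch})
    return len(batches)


if __name__ == "__main__":
    queued = []
    feeds = ["http://example.com/feed/%d" % n for n in range(25)]
    count = cron_fan_out(feeds, lambda url, payload: queued.append((url, payload)))
    print(count, [len(payload["urls"]) for _, payload in queued])
```

The cron job keeps the fixed schedule, and the queue only provides the fan-out and per-piece retry, which avoids relying on the queue's rate limit for timing.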

2 revs answered Oct 03 '22 22:10