I am surprised I was not able to find more on this, but alas, I still cannot find the answer. We recently converted to AWS, moving our simple website to a more robust and reliable system. What is currently baffling me is how to manage cron jobs in a distributed system, when the same cron job gets pushed to every instance in the environment.
Here's the use case:
We are running a traditional LAMP stack. Probably the first problem, but it's what we got.
table1
- id int(11)
- start date
- interval int(11) (number of seconds)
table2
- id int(11)
- table1_id int(11)
- sent datetime
The goal is that a script will run once every day and check the following:
- table1.start < current date
- table1.interval > 0
- There is no entry in table2 such that table2.sent is today and table2.table1_id matches the previous checks.
If all these checks pass, we insert an entry into table2 for each table1 row that qualifies. This also means we send an email based on the data in table2.
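Concretely, the two pieces would look roughly like the sketch below (this assumes MySQL with PDO; the connection details, the reserved-word quoting and the mail step are placeholders, not our actual code):

<?php
// Sketch only: find table1 rows that are due today and have no table2 entry yet.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$sql = "
    SELECT t1.id
    FROM table1 t1
    LEFT JOIN table2 t2
           ON t2.table1_id = t1.id
          AND DATE(t2.sent) = CURDATE()
    WHERE t1.start < CURDATE()
      AND t1.`interval` > 0   -- interval is a reserved word in MySQL
      AND t2.id IS NULL       -- nothing sent today for this row
";

// Second query: record the send, then email based on the new table2 row.
$insert = $pdo->prepare('INSERT INTO table2 (table1_id, sent) VALUES (?, NOW())');
foreach ($pdo->query($sql) as $row) {
    $insert->execute([$row['id']]);
    // mail(...) using the data just inserted
}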
Essentially, we have two queries: the check and the insert. The issue is that on a distributed system, each instance will run cron at the same time (or within milliseconds of each other). There is no notion of a "transaction" across instances, so every instance will send the email if none of them gets a chance to insert into table2 before the others run the first query.
I have done a fair amount of research on this, but the only potential solutions I have come up with are detailed below:
- Set up a single, independent instance responsible for running cron jobs. While this will most certainly (as far as I can see) work, it is very costly for a job that is not terribly expensive and only needs to run once a day, at most.
- Set cron to regularly run a PHP script that acts as a scheduler. This was the route we were going down, since the research suggested it would be the simplest given our limited time and money. The problem I ran into was that this just seems to shift the concurrency problem from consuming jobs to scheduling jobs: when do you schedule the jobs such that multiple jobs aren't scheduled at the same time by each instance running cron? This method also seems very "kludgy" (to borrow a favorite word of my friend), and I would have to agree.
In everything I have researched, concurrency is solved with atomic transactions on the database, but as far as I can tell this isn't easy to achieve on a LAMP stack. Perhaps I am wrong, though, and I would be very happy to be proven so.
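For illustration, the sort of database-level gate I have in mind would look roughly like the sketch below, using MySQL's named locks (the lock name, the zero timeout and the job function are placeholders; I have not convinced myself this is robust across instances):

<?php
// Sketch only: whichever instance acquires the named lock first runs the job.
$pdo = new PDO('mysql:host=db.example.com;dbname=app', 'user', 'pass');

$got = $pdo->query("SELECT GET_LOCK('daily_mailer', 0)")->fetchColumn();
if ($got == 1) {
    // run_daily_job() is a hypothetical function that runs the two queries
    // and sends the emails. Combined with the table2 check in the first
    // query, a late winner would find nothing left to do.
    run_daily_job($pdo);
    $pdo->query("SELECT RELEASE_LOCK('daily_mailer')");
} else {
    // Another instance already holds the lock; this one does nothing.
}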
So if anyone can help me figure this one out, I would greatly appreciate it. Perhaps my Googling skills are getting rusty, but I cannot imagine I am the only one struggling with this (probably simple) task.
cron is a Chef resource that represents a cron job. When AWS OpsWorks Stacks runs the recipe on an instance, the associated provider handles the details of setting up the job. job_name is a user-defined name for the cron job, such as weekly report; hour, minute, and weekday specify when the commands should run.
Dkron is a system service for workload automation that runs scheduled jobs, just like the unix cron service but distributed across several machines in a cluster. It is the only job scheduler on the market with truly no SPOF (single point of failure). It is open source and available for free.
I had a similar problem: I also had cron jobs that had to run every minute, but on a single host only.
I solved it with this hack, which runs the amazon autoscaling tools to find out if the box on which it runs is the last one instantiated in this auto scaling group. This obviously assumes you use autoscaling, and that the hostname contains the instance ID.
#!/usr/bin/env ruby
# Exits 0 if this box is the "last" healthy, in-service instance of the
# autoscaling group (and should therefore run the cron job), 1 otherwise.
AWS_AUTO_SCALING_HOME = '/opt/AutoScaling'
AWS_AUTO_SCALING_URL  = 'https://autoscaling.eu-west-1.amazonaws.com'
MY_GROUP = 'Production'

# List all autoscaling instances (note the spaces before the line
# continuations, so the env vars and the command stay separate words).
@cmd_out = `bash -c 'AWS_AUTO_SCALING_HOME=#{AWS_AUTO_SCALING_HOME} \
AWS_AUTO_SCALING_URL=#{AWS_AUTO_SCALING_URL} \
#{AWS_AUTO_SCALING_HOME}/bin/as-describe-auto-scaling-instances'`
raise "Output empty, should not happen!" if @cmd_out.empty?

# Pick the last in-service, healthy instance of our group.
@lines = @cmd_out.split(/\r?\n/)
@last  = @lines.select { |l| l.match MY_GROUP }.reverse.
         detect { |l| l =~ /^INSTANCE\s+\S+\s+\S+\s+\S+\s+InService\s+HEALTHY/ }
raise "No suitable host in autoscaling group!" unless @last

# The second column of the INSTANCE line is the instance ID.
@last_host = @last.match(/^INSTANCE\s+(\S+)/)[1]
@hostname  = `hostname`

if @hostname.index(@last_host)
  puts "It's me!"
  exit(0)
else
  puts "Someone else will do it!"
  exit(1)
end
I saved it as /usr/bin/lastonly, and then in my cron jobs I do:
lastonly && do_my_stuff
Clearly it's not perfect, but it works for me, and it's simple!
Take a look at the Gearman project: http://www.gearman.org. The basic architecture is that you'll have one machine acting as the job server, and all the other machines become clients of that server.
You can set up the crontab on the job server to send commands for execution to all of the clients connected through Gearman. You can then use PHP to slice and dice your cron jobs and get as deep into Map/Reduce as you want.
Here's a good tutorial on the concepts and how it works: http://www.lornajane.net/posts/2011/Using-Gearman-from-PHP
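To make that concrete, a bare-bones worker/client pair with the PHP Gearman extension looks roughly like this (the function name, job-server host and payload are just examples, not anything prescribed by Gearman):

<?php
// worker.php - runs on each app instance, waits for jobs from the job server
$worker = new GearmanWorker();
$worker->addServer('gearman.internal');   // example job-server host
$worker->addFunction('daily_mailer', function ($job) {
    // Run the table1/table2 queries and send the emails here.
    return 'done';
});
while ($worker->work());

<?php
// enqueue.php - called by cron on the single job server
$client = new GearmanClient();
$client->addServer('gearman.internal');
$client->doBackground('daily_mailer', date('Y-m-d'));   // payload is just an example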
Don't get disheartened about working with something like Gearman right away. Distributed cron systems can be complex, but once you get your head around it you'll be ok.
FWIW, we process thousands of cron scripts every minute amongst a Gearman worker farm on Amazon's EC2. We absolutely love it.