
Running a PHP app on Windows - daemon or cron?

I need some implementation advice. I have a MySQL database that will be written to remotely with tasks to process locally, and I need my application, which is written in PHP, to execute these tasks immediately as they come in.

But of course my PHP app needs to be told when to run. I thought about using cron jobs, but my app is on a Windows machine. Also, I need to check for new tasks every few seconds, and cron can run a job once a minute at most.

I thought of writing a PHP daemon, but I am getting confused about how it's going to work and whether it's even a good idea!

I would appreciate any advice on the best way to do this.

Kay asked Apr 22 '11 17:04



4 Answers

pyCron is a good cron alternative for Windows.

Since this task is quite simple I would just set up pyCron to run the following script every minute:

set_time_limit(60); // one minute, same as CRON ;)
ignore_user_abort(false); // you might wanna set this to true

while (true)
{
    $jobs = getPendingJobs();

    if ((is_array($jobs) === true) && (count($jobs) > 0))
    {
        foreach ($jobs as $job)
        {
            if (executeJob($job) === true)
            {
                markCompleted($job);
            }
        }
    }

    sleep(1); // avoid eating unnecessary CPU cycles
}

This way, if the computer goes down, you'll have a worst case delay of 60 seconds.

You might also want to look into semaphores or some other locking strategy, such as an APC variable or a lock file, so that overlapping runs don't process the same job twice. Using APC, for example:

set_time_limit(60); // one minute, same as CRON ;)
ignore_user_abort(false); // you might wanna set this to true

// apc_add() only succeeds if the key does not exist yet, so checking and
// locking happen in one step; the ttl of 60 secs matches set_time_limit
if (apc_add('lock', true, 60) === true) // not locked, and now locked by us
{

    while (true)
    {
        $jobs = getPendingJobs();

        if ((is_array($jobs) === true) && (count($jobs) > 0))
        {
            foreach ($jobs as $job)
            {
                if (executeJob($job) === true)
                {
                    markCompleted($job);
                }
            }
        }

        sleep(1); // avoid eating unnecessary CPU cycles
    }
}
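
The lock-file variant mentioned above could look roughly like this. It is only a minimal sketch: the helper names acquireLock() and releaseLock() and the lock path are made up for illustration. It takes a non-blocking flock() on the file instead of merely checking whether the file exists, because the operating system drops the lock when the process dies, so a crashed run should not leave a stale lock behind.

```php
<?php
// Hypothetical helpers; the names and the lock path are assumptions for illustration.

/**
 * Try to take an exclusive, non-blocking lock on $path.
 * Returns the open file handle on success, or null if the lock is taken.
 */
function acquireLock(string $path)
{
    $handle = fopen($path, 'c'); // 'c': create if missing, never truncate
    if ($handle === false) {
        return null;
    }
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle); // another run already holds the lock
        return null;
    }
    return $handle;
}

/** Release the lock and close the handle returned by acquireLock(). */
function releaseLock($handle): void
{
    flock($handle, LOCK_UN);
    fclose($handle);
}

$lockFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'job-runner.lock';
$lock = acquireLock($lockFile);

if ($lock !== null) {
    // ... the while (true) polling loop from above would go here ...
    releaseLock($lock);
}
// If $lock is null, a previous run is still busy and this run simply ends.
```

The handle has to stay open for as long as the loop runs, since closing it releases the lock.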

If you're still leaning toward writing a PHP daemon, do yourself a favor and drop that idea; use Gearman instead.

EDIT: I asked a related question once that might interest you: Anatomy of a Distributed System in PHP.

Alix Axel answered Sep 22 '22 01:09



I'll suggest something out of the ordinary: you said you need to run the task at the point the data is written to MySQL. That implies MySQL "knows" something should be executed. It sounds like a perfect scenario for the sys_exec UDF (a user-defined function that has to be installed into MySQL).

Basically, it would be nice if MySQL could invoke an external program whenever something happens to the data. With the UDF mentioned above, you can execute a PHP script from within, let's say, an INSERT or UPDATE trigger. Alternatively, to be more resource-friendly, you can create a MySQL Event (assuming your version supports the event scheduler) that uses sys_exec to invoke a PHP script at predefined intervals. That removes the need for cron or any similar scheduler.

Michael J.V. answered Sep 22 '22 01:09



I would definitely not advise using cronjobs for this.

Cronjobs are useful and easy for many purposes, but given the needs you describe, I think they will cause more complications than they solve. Here are some things to consider:

  • What happens if jobs overlap, for example when one takes longer than a minute to execute? Are there any shared resources, deadlocks or temp files? The most common method is to use a lock file and stop execution right at the start of the program if the lock is taken. But then the program also has to look for further jobs right before it completes. This can also get complicated on Windows machines because, AFAIK, they don't support write locks out of the box.

  • Cronjobs are a pain in the ass to maintain. If you want to monitor them, you have to implement additional logic, such as a check for when the program last ran. That gets difficult if your program should only run on demand. The best way would be some sort of "job completed" field in the database, or deleting rows once they have been processed.

  • On most Unix-based systems cronjobs are pretty stable now, but there are a lot of situations in which you can break your cronjob setup, most of them caused by human error. For example, a sysadmin not exiting the crontab editor properly in edit mode can cause all cronjobs to be deleted. A lot of companies also have no proper monitoring, for the reasons stated above, and only notice once their services experience problems. At that point, often nobody has written down or put under version control which cronjobs should run, and the wild guessing and reconstruction work begins.

  • Cronjob maintenance gets even more complicated when external tools are used and the environment is not a native Unix system. Sysadmins have to learn more programs, each of which can have its own errors.

I honestly think a small script that you start from the console and leave open is just fine.

<?php
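while (true) {
    $job = fetch_from_db(); // next pending job, or a falsy value if there is none
    if (!$job) {
        sleep(10); // nothing to do, wait before polling again
    } else {
        $job->process();
    }
}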

You can also touch a file (update its modification timestamp) on every loop iteration and write a Nagios check that alerts when that timestamp gets out of date, so you know whether your job is still running.
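
A rough sketch of that heartbeat idea follows. The file name, the 60-second threshold and the function names are assumptions made for illustration; the status values 0 (OK) and 2 (CRITICAL) follow the usual Nagios plugin exit-code convention.

```php
<?php
// Hypothetical helpers; names, path and threshold are assumptions for illustration.

/** Called by the worker once per loop iteration. */
function beat(string $file): void
{
    touch($file); // creates the file if needed and updates its modification time
}

/**
 * Called by the monitoring check. Returns a Nagios-style status:
 * 0 = OK, 2 = CRITICAL (heartbeat missing or older than $maxAge seconds).
 */
function heartbeatStatus(string $file, int $maxAge, ?int $now = null): int
{
    $now = $now ?? time();
    clearstatcache(true, $file); // PHP caches stat results such as filemtime()
    if (!file_exists($file)) {
        return 2;
    }
    return ($now - filemtime($file)) <= $maxAge ? 0 : 2;
}

$heartbeat = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'worker.heartbeat';

beat($heartbeat);                          // this call belongs inside the worker loop
$status = heartbeatStatus($heartbeat, 60); // this call belongs in the check script
echo $status === 0 ? "OK: worker is alive\n" : "CRITICAL: worker heartbeat is stale\n";
```

A real check script would finish with exit($status) so that Nagios can read the result from the exit code.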

If you want it to start up with the system, I recommend a daemon.

PS: At the company where I work there is a lot of background activity for our website (crawling, update processes, calculations, etc.), and the cronjobs were a real mess when I started there. They were spread over different servers responsible for different tasks. Databases were accessed wildly across the internet. A ton of NFS filesystems, Samba shares, etc. were in place to share resources. The place was full of single points of failure and bottlenecks, and something constantly broke. There were so many technologies involved that it was very difficult to maintain, and when something didn't work it took hours to track down the problem and another hour to figure out what that part was even supposed to do.

Now we have one unified update program that is responsible for literally everything. It runs on several servers, and they have a config file that defines the jobs to run. Everything gets dispatched from one parent process doing an infinite loop. It's easy to monitor, customize and synchronize, and everything runs smoothly. It is redundant, it is synchronized, and the granularity is fine, so it runs in parallel and we can scale up to as many servers as we like.
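
To give an idea of what "a config that defines the jobs, dispatched from one parent loop" can mean, here is a heavily simplified, single-process sketch. It is not our actual program: the job names, intervals, callbacks and function names are invented for illustration, and it leaves out the multi-server and parallel parts entirely.

```php
<?php
// Hypothetical sketch: job names, intervals and callbacks are invented for illustration.

/** Return the names of all jobs whose interval has elapsed since their last run. */
function dueJobs(array $config, array $lastRun, int $now): array
{
    $due = [];
    foreach ($config as $name => $job) {
        $last = $lastRun[$name] ?? 0; // a job that never ran is always due
        if ($now - $last >= $job['interval']) {
            $due[] = $name;
        }
    }
    return $due;
}

/** One pass of the parent loop: run every due job and record when it ran. */
function dispatchOnce(array $config, array &$lastRun, int $now): array
{
    $ran = dueJobs($config, $lastRun, $now);
    foreach ($ran as $name) {
        call_user_func($config[$name]['run']);
        $lastRun[$name] = $now;
    }
    return $ran;
}

// In practice this array would be loaded from a config file.
$config = [
    'crawl'     => ['interval' => 30,  'run' => function () { echo "crawling\n"; }],
    'recompute' => ['interval' => 300, 'run' => function () { echo "recomputing\n"; }],
];

$lastRun = [];
// The parent process would repeat this forever, e.g. while (true) { ...; sleep(1); }
dispatchOnce($config, $lastRun, time());
```

Keeping the "which jobs are due" decision in a pure function makes the schedule easy to test without waiting for real time to pass.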

I really suggest sitting down for long enough to think about everything as a whole and get a picture of the complete system. Then invest the time and effort to implement a solution that will serve you well in the future and doesn't spread tons of different programs throughout your system.

PPS:

I read a lot about the minimum interval of 1 or 5 minutes for cronjobs/scheduled tasks. You can easily work around that with a small wrapper script that subdivides that interval:

// the scheduler runs this wrapper every 5 minutes = 300 secs
// desired interval: 30 secs
$parentInterval = 300;
$interval = 30; // the parent interval needs to be a multiple of the desired interval
$runs = $parentInterval / $interval;
for ($i = 0; $i < $runs; $i++) {
    $start = time();
    system('php myscript.php'); // assumes the php binary is on the PATH
    // compensate for the time the script needed to run; if it took longer than
    // the interval, don't sleep at all (overlaps still need handling, as described above)
    sleep(max(0, $interval - (time() - $start)));
}

The Surrican answered Sep 22 '22 01:09



This looks like a job for a job server ;) Have a look at Gearman. An additional benefit of this approach is that the work is triggered by the remote side, if and only if there is something to do, instead of by polling. Especially at intervals shorter than (let's say) 5 minutes, polling is no longer very efficient, depending on the tasks the job performs.

KingCrunch answered Sep 19 '22 01:09
