I want to write a worker for beanstalkd in PHP, using a Zend Framework 2 controller. It starts via the CLI and will run forever, asking beanstalkd for jobs along the lines of this example. In simple pseudo-like code:
```php
while (true) {
    $data = $beanstalk->reserve();

    $class  = $data->class;
    $params = $data->params;

    $job = new $class($params);
    $job();
}
```
The `$job` here has an `__invoke()` method, of course. However, some of these jobs might run for a long time. Some might use a considerable amount of memory. Some might have the `$beanstalk` object injected, to start new jobs themselves, or have a `Zend\Di\Locator` instance to pull objects from the DIC.
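For illustration, a minimal job might look like this; only the constructor-plus-`__invoke()` shape comes from the pseudocode above, while the class name, payload keys, and `mail()` body are assumptions:

```php
// A sketch of a job as the loop above consumes it. The class name,
// payload keys, and mail() body are assumptions for illustration.
class SendMailJob
{
    protected $params;

    public function __construct(array $params)
    {
        $this->params = $params;
    }

    // The worker invokes the job as a callable: $job();
    public function __invoke()
    {
        mail($this->params['to'], $this->params['subject'], $this->params['body']);
    }
}
```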
I am worried about this setup in production over the long term: perhaps circular references might occur, and at this moment I do not explicitly do any garbage collection, while this loop might run for weeks/months/years*.
*) In beanstalkd, `reserve` is a blocking call: if no job is available, this worker will wait until it gets a response back from beanstalkd.
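If the loop should wake up periodically even when the queue is empty (for example to run housekeeping such as `gc_collect_cycles()`), the beanstalkd protocol also has a `reserve-with-timeout` command. Pheanstalk-style clients typically expose it as an optional timeout argument; a sketch could look like this, but verify the exact signature and return value of your client library:

```php
// Sketch: wake up every 60 seconds even when the queue is empty, so the
// loop can do housekeeping. The optional timeout argument is how
// Pheanstalk-style clients expose beanstalkd's reserve-with-timeout
// command; check how your client signals a timeout.
while (true) {
    $data = $beanstalk->reserve(60);

    if ($data === false) {    // timed out, nothing to do
        gc_collect_cycles();  // housekeeping while idle
        continue;
    }

    // ... handle the job as before
}
```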
My question: how will PHP handle this over the long term, and should I take any special precautions to keep this from blocking?
This is what I did consider and what might be helpful (but please correct me if I am wrong and add more if possible); a sketch of the loop body follows this list:

- unset the `$job` in every iteration
- explicitly call `__destruct()` on a `$job`
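In the loop body that would look roughly like this; note that PHP runs `__destruct()` automatically when the last reference is released, and that objects caught in reference cycles are only reclaimed by the cycle collector:

```php
$job = new $class($params);
$job();

// Drop the last reference; if nothing else holds the job, PHP frees it
// and runs its __destruct() here. Objects caught in reference cycles
// are only reclaimed by the cycle collector (gc_collect_cycles()).
unset($job);
```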
(NB: Update from here)
I ran some tests with arbitrary jobs. The jobs I included were:

- "simple": just set a value;
- "longarray": create an array of 1,000 values;
- "producer": gets `$pheanstalk` injected by the loop and adds three simple jobs to the queue (so there is now a reference from the job to beanstalk);
- "locatoraware": gets a `Zend\Di\Locator` and instantiates all job types (though it does not invoke them).

I added 10,000 jobs to the queue, then reserved all jobs from the queue.
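For instance, the "producer" job might be sketched like this; the `put()` call is standard Pheanstalk, while the class name and payload layout are assumptions:

```php
// Sketch of the "producer" test job: it keeps a reference to the
// Pheanstalk client and enqueues three "simple" jobs when invoked,
// which is the job -> beanstalk reference mentioned above. The
// payload layout (class/params) follows what the worker loop expects.
class ProducerJob
{
    protected $pheanstalk;

    public function __construct($pheanstalk)
    {
        $this->pheanstalk = $pheanstalk;
    }

    public function __invoke()
    {
        for ($i = 0; $i < 3; $i++) {
            $this->pheanstalk->put(json_encode(array(
                'class'  => 'SimpleJob',
                'params' => array(),
            )));
        }
    }
}
```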
Results for "simplejob" (memory consumption per 1,000 jobs, with memory_get_usage()
)
0: 56392
1000: 548832
2000: 1074464
3000: 1538656
4000: 2125728
5000: 2598112
6000: 3054112
7000: 3510112
8000: 4228256
9000: 4717024
10000: 5173024
Picking a random job, measuring the same as above. Distribution:

```
["Producer"]     => int(2431)
["LongArray"]    => int(2588)
["LocatorAware"] => int(2526)
["Simple"]       => int(2456)
```
Memory:

```
    0:   66164
 1000:  810056
 2000: 1569452
 3000: 2258036
 4000: 3083032
 5000: 3791256
 6000: 4480028
 7000: 5163884
 8000: 6107812
 9000: 6824320
10000: 7518020
```
The execution code from above is updated to this:

```php
$baseMemory = memory_get_usage();
gc_enable();

for ($i = 0; $i <= 10000; $i++) {
    $data = $beanstalk->reserve();

    $class  = $data->class;
    $params = $data->params;

    $job = new $class($params);
    $job();

    // Release the job reference explicitly after every iteration
    $job = null;
    unset($job);

    // Force a cycle collection every 1,000 jobs and report usage
    if ($i % 1000 === 0) {
        gc_collect_cycles();
        echo sprintf('%8d: ', $i), memory_get_usage() - $baseMemory, "<br>";
    }
}
```
As everybody notices, the memory consumption in PHP is not levelled off and kept to a minimum, but increases over time.
Memory leaks can happen in any language, including PHP. These memory leaks may happen in small increments that take time to accumulate, or in larger jumps that manifest quickly.
I've usually restarted the script regularly, though you don't have to do it after every job is run (unless you want to; it is useful for clearing memory). You could, for example, run up to 100 jobs or more at a time, or until the script has used say 20 MB of RAM, and then exit the script, to be instantly re-run.
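A sketch of that guard, using the example figures above; a wrapping shell loop or a supervisor then restarts the script the moment it exits:

```php
// Sketch: exit after a bounded number of jobs or a memory threshold,
// and let a wrapping shell loop or supervisor re-run the script.
// The 100-job and 20 MB limits are the example figures from above.
$maxJobs   = 100;
$maxMemory = 20 * 1024 * 1024; // 20 MB

for ($i = 0; $i < $maxJobs; $i++) {
    $data = $beanstalk->reserve();

    $class  = $data->class;
    $params = $data->params;

    $job = new $class($params);
    $job();
    unset($job);

    if (memory_get_usage() > $maxMemory) {
        break; // stop before memory grows further
    }
}

exit(0); // the supervisor starts a fresh process immediately
```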
My blogpost at http://www.phpscaling.com/2009/06/23/doing-the-work-elsewhere-sidebar-running-the-worker/ has some example shell scripts for re-running the scripts.