I have a PHP script that uses Doctrine 2 and Zend to calculate some things from a database and send emails to 30,000 users.
My script is leaking memory, and I want to know which objects are consuming that memory and, if possible, who is keeping a reference to them (thus preventing them from being released).
I'm using PHP 5.3.x, so plain circular references shouldn't be the problem.
I've tried using Xdebug's trace capabilities to get mem_delta, with no success (too much data).
I've also tried manually adding memory_get_usage() before and after the important functions. But the only conclusion I reached was that I lose around 400 KB per user, and 3,000 users times that gives the 1 GB I have available.
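For reference, the per-user measurement described above can be sketched like this; processUser() is a hypothetical stand-in for the real per-user work, not part of the original script:

```php
<?php
// Hypothetical sketch: log the memory delta per user to narrow down where
// the ~400 KB per user goes. processUser() stands in for the real work.
function processUser(int $id): array
{
    return array_fill(0, 1000, $id); // simulate per-user allocations
}

$deltas = [];
$before = memory_get_usage(true);
foreach (range(1, 10) as $id) {
    $result = processUser($id);
    unset($result); // release what is no longer needed
    $after = memory_get_usage(true);
    $deltas[$id] = $after - $before;
    $before = $after;
}
print_r($deltas); // a steadily positive delta points at a leak
```

If the delta stays flat at zero after unset(), the leak is elsewhere; if it keeps growing, something inside the per-user work is holding on to memory.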
Are there any other ways to find out where and why memory is leaking? Thanks.
You could try sending, say, 10 emails and then inserting this:
var_dump(get_defined_vars());
http://nz.php.net/manual/en/function.get-defined-vars.php
at the end of the script, or after each email is sent (depending on how your code is set up).
This should tell you what is still loaded, and what you can unset or turn into a reference.
Also, if there are too many things loaded, call this near the start and end of your code and work out the difference.
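The diffing idea can be sketched like this (a self-contained illustration; $bigString plays the role of data that was never unset, and none of these names come from the original script):

```php
<?php
// Compare the variables defined before and after some work to see what is
// still loaded. Everything here is illustrative.
function demo(): array
{
    $baseline = array_keys(get_defined_vars()); // snapshot: nothing defined yet

    $bigString = str_repeat('x', 1024); // simulates a leaked variable
    $temporary = range(1, 100);
    unset($temporary);                  // properly released

    $now = array_keys(get_defined_vars());
    // Whatever is in $now but not in the baseline is still loaded.
    return array_values(array_diff($now, $baseline, ['baseline']));
}

print_r(demo()); // reports bigString as still loaded; temporary is gone
```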
30,000 objects to hydrate is quite a lot. Doctrine 2 is stable, but there are some bugs, so I am not too surprised by your memory leak problems.
That said, with smaller data sets I have had good success using Doctrine's batch-processing capabilities and creating an iterable result.
You can use the code from the examples and add a gc_collect_cycles() call
after each batch. You have to test it, but for me batch sizes of around 100 worked quite well – that number gave a good balance between performance and memory usage.
It's quite important that the script records which entities were processed, so that it can be restarted without problems and resume normal operation without sending emails twice.
$batchSize = 20;
$i = 1; // start at 1 so the first iteration does not trigger a flush
$q = $em->createQuery('select u from MyProject\Model\User u');
$iterableResult = $q->iterate();
while (($row = $iterableResult->next()) !== false) {
    $entity = $row[0];
    // do stuff with $entity here
    // mark entity as processed
    if (($i % $batchSize) === 0) {
        $em->flush();        // persist the "processed" marks
        $em->clear();        // detach all entities so they can be freed
        gc_collect_cycles(); // collect any circular references left behind
    }
    ++$i;
}
$em->flush(); // flush the final, possibly partial batch
Anyhow, maybe you should rethink the architecture of that script a bit, as an ORM is not well suited to processing large chunks of data. Maybe you can get away with working on the raw SQL rows?
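As a hedged sketch of that raw-SQL route, you can drop down to the DBAL connection so no entities are hydrated at all. The table and column names (users, email, processed_at) and the sendMail() helper are assumptions, and fetchAssociative()/executeStatement() are the modern DBAL method names:

```php
<?php
// Process rows as plain arrays via the DBAL connection instead of hydrating
// entities. Schema names and sendMail() are hypothetical.
$conn = $em->getConnection();

$stmt = $conn->executeQuery(
    'SELECT id, email FROM users WHERE processed_at IS NULL'
);

while (($row = $stmt->fetchAssociative()) !== false) {
    sendMail($row['email']); // hypothetical mailer call

    // Marking the row immediately makes the script safe to restart
    // without sending an email twice.
    $conn->executeStatement(
        'UPDATE users SET processed_at = NOW() WHERE id = ?',
        [$row['id']]
    );
}
```

Plain arrays are cheap to allocate and free, so the per-user memory cost drops to roughly the size of one row.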