I'm having trouble locating the cause of a memory leak in my script. I have a simple repository method which increments a 'count' column in my entity by a given amount:
public function incrementCount($id, $amount)
{
$query = $this
->createQueryBuilder('e')
->update('MyEntity', 'e')
->set('e.count', 'e.count + :amount')
->where('e.id = :id')
->setParameter('id', $id)
->setParameter('amount', $amount)
->getQuery();
$query->execute();
}
Problem is, if I call this in a loop the memory usage balloons on every iteration:
$entityManager = $this->getContainer()->get('doctrine')->getManager();
$myRepository = $entityManager->getRepository(MyEntity::class);
while (true) {
$myRepository->incrementCount("123", 5);
$entityManager->clear();
gc_collect_cycles();
}
What am I missing here? I've tried ->clear(), as per Doctrine's advice on batch processing. I even tried gc_collect_cycles(), but still the issue remains.
I'm running Doctrine 2.4.6 on PHP 5.5.
I just ran into the same issue. These are the things that fixed it for me:
As the OP mentioned in their answer, setting --no-debug (ex: php bin/console <my_command> --no-debug) is crucial for performance/memory in Symfony console commands. This is especially true when using Doctrine: without it, Doctrine runs in debug mode, which consumes a huge amount of additional memory (and that amount grows on each iteration). See the Symfony docs here and here for more info.
You should also always specify the environment. By default, Symfony uses the dev environment for console commands. The dev environment usually isn't optimized for memory, speed, CPU etc. If you want to iterate over thousands of items, you should probably be using the prod environment (ex: php bin/console <my_command> --env prod). See here and here for more info.
Tip: I created an environment called console that I specifically configured for running console commands. Here is info about how to create additional Symfony environments.
If running a big update, you should probably decide how much memory is acceptable for it to consume. This is especially important if you think there might be a leak. You can specify the memory limit for the command by using php -d memory_limit=x (ex: php -d memory_limit=256M). Note: you can set the limit to -1 (usually the default for the PHP CLI) to let the command run with no memory limit, but this is obviously dangerous.
A well-formed console command for running a big update using the above tips would look like:
php -d memory_limit=256M bin/console <acme>:<your_command> --env=prod --no-debug
Another huge one when using Doctrine's ORM in a loop is to use Doctrine's IterableResult (see the Doctrine Batch Processing docs). This won't help in the example provided, but this kind of processing usually runs over the results of a query.
If part of what you are doing is making changes to the data, you should flush periodically instead of on each iteration. Flushing is expensive and slow. The less often you flush, the faster your command will finish. Keep in mind, however, that Doctrine will hold the unflushed data in memory. So the less often that you flush, the more memory you will need.
You can use something like the following to flush every 100 iterations:
if ($count % 100 === 0) {
$this->em->flush();
}
Also make sure to flush again at the end of your loop (for flushing the last < 100 entries).
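Putting the counter check and the final flush together, a minimal sketch might look like the following. CountingEntityManager is a hypothetical stand-in invented here so the snippet runs without Doctrine, and processInBatches is likewise an illustrative name; in a real command you would pass your actual entity manager.

```php
<?php

// Hypothetical stand-in for the entity manager so the sketch runs on its own.
final class CountingEntityManager
{
    public $flushes = 0;
    public $pending = 0;

    public function persist($entity): void
    {
        $this->pending++;
    }

    public function flush(): void
    {
        $this->flushes++;
        $this->pending = 0;
    }
}

// Hypothetical helper: persists every item, flushing once per full batch.
function processInBatches($em, iterable $items, int $batchSize = 100): int
{
    $count = 0;
    foreach ($items as $item) {
        $em->persist($item);
        $count++;
        if ($count % $batchSize === 0) {
            $em->flush();
        }
    }
    // Flush whatever is left over from the last, incomplete batch.
    if ($count % $batchSize !== 0) {
        $em->flush();
    }

    return $count;
}

$em = new CountingEntityManager();
$processed = processInBatches($em, range(1, 250), 100);
```

With 250 items and a batch size of 100, the stub is flushed after items 100 and 200 and once more after the loop for the remaining 50.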
You may also want to clear after you flush:
$this->em->flush();
$this->em->clear(); // Detach ALL objects from Doctrine.
Or
$this->em->flush();
$this->em->clear(MyEntity::class); // Detach all MyEntity from Doctrine.
$this->em->clear(MyRelatedEntity::class); // Detach all MyRelatedEntity from Doctrine.
It can be really helpful to keep track of how much memory your command is consuming while it is running. You can do that by outputting the response returned by PHP's built-in memory_get_usage() function.
$output->writeln(memory_get_usage());
$memUse = round(memory_get_usage() / 1000000, 2).'MB';
$this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
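If you report memory in several places, it may be worth pulling the formatting into a small helper. The sketch below is one way to do that; formatMegabytes and memoryReport are hypothetical names, and the peak figure from PHP's built-in memory_get_peak_usage() is an addition that can help spot a spike between two progress lines.

```php
<?php

// Hypothetical helper: formats a byte count as megabytes
// (1 MB = 1,000,000 bytes, matching the division used above).
function formatMegabytes(int $bytes): string
{
    return round($bytes / 1000000, 2) . 'MB';
}

// Hypothetical helper: builds one progress line with current and peak memory.
function memoryReport(int $processed, int $total): string
{
    return sprintf(
        'Processed %d of %d (mem: %s, peak: %s)',
        $processed,
        $total,
        formatMegabytes(memory_get_usage()),
        formatMegabytes(memory_get_peak_usage())
    );
}

echo memoryReport(1000, 4000000), PHP_EOL;
```

Inside a command you would pass the returned string to $output->writeln() instead of echoing it.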
It may also be helpful to roll your own batches. You can do this by using a start and limit, just like you would for pagination. I was able to process 4 million rows using only 90 MB of RAM doing this.
Here's some example code:
protected function execute(InputInterface $input, OutputInterface $output) {
/* ... */
$totalCount = $this->getTotalCount();
$batchSize = 10000;
$i = 0;
while ($i < $totalCount) {
$i = $this->processBatch($i, $batchSize, $totalCount);
}
/* ... */
}
private function processBatch(int $start, int $limit, int $totalCount): int {
/* @var $q \Doctrine\ORM\Query */
$q = $this->em->createQueryBuilder()
->select('e')
->from('AcmeExampleBundle:MyEntity', 'e')
->setFirstResult($start)
->setMaxResults($limit)
->getQuery();
/* @var $iterableResult \Doctrine\ORM\Internal\Hydration\IterableResult */
$iterableResult = $q->iterate(null, \Doctrine\ORM\Query::HYDRATE_SIMPLEOBJECT);
$i = $start;
foreach ($iterableResult as $row) {
/* @var $myEntity \App\Entity\MyEntity */
$myEntity = $row[0];
$this->processOne($myEntity);
if (0 === ($i % 1000)) {
$memUse = round(memory_get_usage() / 1000000, 2).'MB';
$this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
}
$this->em->detach($row[0]);
$i++;
}
return $i;
}
private function processOne(MyEntity $myEntity): void {
// Do entity processing here.
}
private function getTotalCount(): int {
/* @var $q \Doctrine\ORM\Query */
$q = $this->em
->createQueryBuilder()
->select('COUNT(e.id)')
->from('AcmeExampleBundle:MyEntity', 'e')
->getQuery();
// getSingleScalarResult() may return the count as a string, so cast it to match the int return type.
return (int) $q->getSingleScalarResult();
}
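The start/limit arithmetic behind this kind of loop can also be checked in isolation. The following is only a sketch: batchWindows is a hypothetical helper, not part of Doctrine, that yields the offset and limit of each batch. Driving the outer loop from such a generator means the loop always terminates, even if a batch returns fewer rows than expected.

```php
<?php

// Hypothetical helper: yields [offset, limit] pairs covering $total rows.
function batchWindows(int $total, int $batchSize): Generator
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }
    for ($offset = 0; $offset < $total; $offset += $batchSize) {
        // The last window is shortened so it does not run past $total.
        yield [$offset, min($batchSize, $total - $offset)];
    }
}

foreach (batchWindows(25000, 10000) as [$offset, $limit]) {
    echo "offset=$offset limit=$limit", PHP_EOL;
}
```

Each pair would be passed to setFirstResult() and setMaxResults() respectively.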
Good luck!
I resolved this by adding --no-debug to my command. It turns out that in debug mode, the profiler was storing information about every single query in memory.
Doctrine logs every query you make. If you make lots of queries (which normally happens in loops), that log keeps growing and can consume a huge amount of memory.
You need to disable the Doctrine SQL Logger to overcome this.
I recommend doing this only for the loop part.
Before loop, get current logger:
$sqlLogger = $em->getConnection()->getConfiguration()->getSQLLogger();
And then disable the SQL Logger:
$em->getConnection()->getConfiguration()->setSQLLogger(null);
Do loop here:
foreach() / while() / for()
After loop ends, put back the Logger:
$em->getConnection()->getConfiguration()->setSQLLogger($sqlLogger);
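One weakness of the sequence above is that an exception thrown inside the loop leaves the logger disabled. A sketch that restores it with try/finally follows. withoutSqlLogger is a hypothetical helper name, and it relies only on the getSQLLogger() and setSQLLogger() calls already used above; depending on your DBAL version these methods may be deprecated or unavailable, so treat this as an assumption to verify.

```php
<?php

// Hypothetical helper: runs $work with the SQL logger disabled and restores
// the previous logger afterwards, even if $work throws.
// $configuration is expected to expose getSQLLogger()/setSQLLogger(), like the
// object returned by $em->getConnection()->getConfiguration() above.
function withoutSqlLogger($configuration, callable $work)
{
    $previousLogger = $configuration->getSQLLogger();
    $configuration->setSQLLogger(null);

    try {
        return $work();
    } finally {
        $configuration->setSQLLogger($previousLogger);
    }
}

// Usage (assumes $em and $myRepository exist in your command):
// withoutSqlLogger($em->getConnection()->getConfiguration(), function () use ($myRepository) {
//     foreach (range(1, 1000) as $i) {
//         $myRepository->incrementCount('123', 5);
//     }
// });
```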
For me it was clearing doctrine, or as the documentation says, detaching all entities:
$this->em->clear(); //Here em is the entity manager.
So inside my loop I flush every 1000 iterations and detach all entities (I don't need them anymore):
foreach ($reader->getRecords() as $position => $value) {
$this->processValue($value, $position);
if($position % 1000 === 0){
$this->em->flush();
$this->em->clear();
}
$this->progress->advance();
}
Hope this helps.
PS: here's the documentation.
You're wasting memory for each iteration. A much better way would be to prepare the query once and swap arguments many times. For example:
use Doctrine\ORM\EntityRepository;
class MyEntityRepository extends EntityRepository
{
private $updateQuery = null;
public function incrementCount($id, $amount)
{
// Build the query only once, then reuse it with new parameters.
if ($this->updateQuery === null) {
$this->updateQuery = $this->createQueryBuilder('e')
->update('MyEntity', 'e')
->set('e.count', 'e.count + :amount')
->where('e.id = :id')
->getQuery();
}
$this->updateQuery
->setParameter('id', $id)
->setParameter('amount', $amount)
->execute();
}
}
As you mentioned, you can employ batch processing here, but try this out first and see how well (if at all) it performs...
I had similar issues with a memory leak. I'm running Doctrine in a Symfony 5.2 project. More specifically, I built a never-ending Command which processes entries from one table, retrieves entries from another table, and creates 2 new entries in other tables (event processing).
I solved my leakage problems in two steps.
1. --no-debug when running the command (as already suggested by Jonathan)
2. $this->entityManager->clear();
In order to see and identify the leakages, I used the following line to output the current memory usage:
$output->writeln('Memory Usage in MB: ' . memory_get_usage() / 1024 / 1024);
Maybe this helps anyone still fighting with leakages.
I encountered the same issue and disabling the query cache helped me.
$query = $this
->createQueryBuilder('e')
->update('MyEntity', 'e')
->set('e.count', 'e.count + :amount')
->where('e.id = :id')
->setParameter('id', $id)
->setParameter('amount', $amount)
->getQuery()
->useQueryCache(false); // <-- this line