
Doctrine: Why can't I free memory when accessing entities through an association?

I have an Application that has a relationship to ApplicationFile:

/**
 * @ORM\OneToMany(
 *   targetEntity="AppBundle\Entity\ApplicationFile",
 *   mappedBy="application",
 *   cascade={"remove"},
 *   orphanRemoval=true
 * )
 */
private $files;

A file entity has a field that stores binary data, and can be up to 2MB in size. When iterating over a large list of applications and their files, PHP memory usage grows. I want to keep it down.

I've tried this:

$applications = $this->em->getRepository('AppBundle:Application')->findAll();
foreach ($applications as $app) {
  ...
  foreach ($app->getFiles() as $file) {
    ...
    $this->em->detach($file);
  }
  $this->em->detach($app);
}

Detaching an object should tell the entity manager to stop tracking it and to drop its references to it, but surprisingly this has no effect on memory usage: it keeps increasing.

Instead, I have to manually load the application files (instead of retrieving them through the association method), and the memory usage does not increase. This works:

$applications = $this->em->getRepository('AppBundle:Application')->findAll();
foreach ($applications as $app) {
  ...

  $appFiles = $this
      ->em
      ->getRepository('AppBundle:ApplicationFile')
      ->findBy(array('application' => $app));

  foreach ($appFiles as $file) {
    ...
    $this->em->detach($file);
  }
  $this->em->detach($app);
}

I used xdebug_debug_zval to track references to the $file object. In the first example, there's an extra reference somewhere, which explains why memory is ballooning - PHP is not able to garbage collect it!

Does anyone know why this is? Where is this extra reference and how do I remove it?

EDIT: Explicitly calling unset($file) at the end of its loop has no effect. There are still TWO references to the object at this point (proven with xdebug_debug_zval). One contained in $file (which I can unset), but there's another somewhere else that I cannot unset. Calling $this->em->clear() at the end of the main loop has no effect either.

EDIT 2: SOLUTION: The answer by @origaminal led me to the solution, so I accepted his answer instead of providing my own.

In the first method, where I access the files through the association on $application, this has a side effect of initializing the previously uninitialized $files collection on the $application object I'm iterating over in the outer loop.

Calling $em->detach($application) and $em->detach($file) only tells Doctrine's UoW to stop tracking the objects. It doesn't affect the array of $applications I'm iterating over, whose objects now hold populated $files collections that eat up memory.
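The same effect can be reproduced without Doctrine. Below is a minimal sketch in plain PHP (7.4+ is assumed, for WeakReference and typed properties). Tracker, App, and File are hypothetical stand-ins for the UnitOfWork, Application, and ApplicationFile; they are not Doctrine classes and only imitate the reference pattern.

```php
<?php
// Hypothetical stand-ins; none of these are Doctrine classes.
final class File
{
    public string $data;

    public function __construct(string $data)
    {
        $this->data = $data;
    }
}

final class App
{
    /** @var File[]|null null plays the role of an uninitialized collection */
    private ?array $files = null;

    public function getFiles(): array
    {
        // First access "initializes" the collection and stores it on the object.
        if ($this->files === null) {
            $this->files = [new File(str_repeat('x', 1024))];
        }
        return $this->files;
    }
}

final class Tracker
{
    private array $tracked = [];

    public function track(object $o): void
    {
        $this->tracked[spl_object_id($o)] = $o;
    }

    public function detach(object $o): void
    {
        unset($this->tracked[spl_object_id($o)]);
    }
}

$tracker = new Tracker();
$apps = [new App()];
$tracker->track($apps[0]);

$file = $apps[0]->getFiles()[0];
$tracker->track($file);
$probe = WeakReference::create($file); // observes the File without keeping it alive

$tracker->detach($file);
$tracker->detach($apps[0]);
unset($file);
$aliveAfterDetach = $probe->get() !== null; // App::$files still references the File

unset($apps[0]);
$aliveAfterUnset = $probe->get() !== null;  // the last owner of the File is gone

var_dump($aliveAfterDetach, $aliveAfterUnset);
```

In this sketch the File survives both detach calls and only becomes collectable once the array slot holding the App is unset, which mirrors the behaviour described above.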

I have to unset each $application object after I'm done with it to remove all references to the loaded $files. To do this, I modified the loops as such:

    $applications = $em->getRepository('AppBundle:Application')->findAll();
    $count = count($applications);
    for ($i = 0; $i < $count; $i++) {
        foreach ($applications[$i]->getFiles() as $file) {
            $file->getData();
            $em->detach($file);
            unset($file);
        }
        $em->detach($applications[$i]);
        unset($applications[$i]);

        // Don't NEED to force GC, but doing so helps for testing.
        gc_collect_cycles();
    }
Brian asked Aug 12 '15 19:08


2 Answers

Cascade

EntityManager::detach should indeed remove all references Doctrine has to the entities. But it does not do the same for associated entities automatically.

You need to cascade this action by adding detach to the cascade option of the association:

/**
 * @ORM\OneToMany(
 *   targetEntity="AppBundle\Entity\ApplicationFile",
 *   mappedBy="application",
 *   cascade={"remove", "detach"},
 *   orphanRemoval=true
 * )
 */
private $files;

Now $em->detach($app) should be enough to remove references to the Application entity as well as its associated ApplicationFile entities.

Find vs Collection

I highly doubt that loading the ApplicationFile entities through the association, instead of using the repository to findBy() them, is the source of your issue.

Sure, when loaded through the association, the Collection will hold a reference to those child entities. But when the parent entity is dereferenced, the entire tree will be garbage collected, unless there are other references to those child entities.
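One detail worth keeping in mind: a parent and child that point at each other form a reference cycle, so they are reclaimed by PHP's cycle collector rather than at the moment the last variable is unset. A minimal sketch, assuming PHP 7.4+ with the default zend.enable_gc=1; ParentNode and ChildNode are hypothetical classes standing in for Application and ApplicationFile:

```php
<?php
// Hypothetical classes that imitate a bidirectional association.
final class ParentNode
{
    /** @var ChildNode[] */
    public array $children = [];
}

final class ChildNode
{
    public ParentNode $parent;

    public function __construct(ParentNode $parent)
    {
        $this->parent = $parent;
        $parent->children[] = $this; // the back-reference creates a cycle
    }
}

$parent = new ParentNode();
$child = new ChildNode($parent);
$probe = WeakReference::create($child); // observes the child without owning it

unset($child, $parent);
$aliveBeforeGc = $probe->get() !== null; // the cycle keeps both refcounts above zero

$collected = gc_collect_cycles();        // force a cycle collection run
$aliveAfterGc = $probe->get() !== null;

var_dump($aliveBeforeGc, $collected > 0, $aliveAfterGc);
```

Without the explicit gc_collect_cycles() call, PHP runs the collector on its own schedule, so memory held by such a tree may be released later than expected even when nothing else references it.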

I suspect the code you show is pseudo/example code, not the actual code in production. Please examine that code thoroughly to find those other references.

Clear

Sometimes it is worth clearing the entire EntityManager and merging a few entities back in. You could try $em->clear() or $em->clear('AppBundle\Entity\ApplicationFile').

Clear has no effect

You're saying that clearing the EntityManager has no effect. This means the references you're searching for are not within the EntityManager (or UnitOfWork), because you've just cleared that.

Doctrine but not Doctrine

Are you using any event-listeners or -subscribers? Any filters? Any custom mapping types? Multiple EntityManagers? Anything else that could be integrated into Doctrine or its life-cycle, but is not necessarily part of Doctrine itself?

Event listeners and subscribers in particular are often overlooked when searching for the source of issues, so I'd suggest you start looking there.

Jasper N. Brouwer answered Sep 30 '22 18:09


If we are speaking about your first implementation, you have extra references to the collection in PersistentCollection::coll of the Application::files property. This object is created by Doctrine when the Application is instantiated.

With detach you are only deleting the UoW's references to the object.

There are different ways to fix this, but many of them are hacks. Probably the nicest way is to also detach the Application object and unset it.

But it is still preferable to use more advanced approaches to batch processing; some were listed in the other answer. The current approach forces Doctrine to use proxies and issue extra queries to the DB to get the files of the current object.
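As one illustration of such a batch approach, here is a rough sketch that streams the applications with Query::iterate() instead of loading them all with findAll(). It assumes a Doctrine ORM 2.x setup with the entities from the question; processApplications and the $handleFile callback are hypothetical names, and this is not necessarily what either answer had in mind.

```php
<?php
use Doctrine\ORM\EntityManagerInterface;

/**
 * Sketch only: processes Application entities one at a time.
 * $handleFile is a hypothetical callback supplied by the caller.
 */
function processApplications(EntityManagerInterface $em, callable $handleFile): void
{
    $query = $em->createQuery('SELECT a FROM AppBundle\Entity\Application a');

    // iterate() hydrates one row per step; each row is an array with the
    // entity at index 0. Newer ORM releases offer toIterable() instead.
    foreach ($query->iterate() as $row) {
        $app = $row[0];

        foreach ($app->getFiles() as $file) {
            $handleFile($file);
        }

        // Detach everything the EntityManager tracks for this step, then
        // drop the local references so nothing else holds the files.
        $em->clear();
        unset($app, $row);
    }
}
```

Because no array of all applications is ever built, each Application and its initialized files collection has no remaining owner once the iteration moves on.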

Edit

The difference between the first and the second implementation is that there are no circular references in the second case: Application::files stays an uninitialized PersistentCollection (with no elements in coll).

To check this, can you try dropping the files association explicitly?

origaminal answered Sep 30 '22 18:09