Skip Entities while flushing when they are a Duplicate

I'm playing a little bit with Symfony2 and Doctrine2.

I have an entity that has a unique title, for example:

class ListItem
{
    /**
     * @orm:Id
     * @orm:Column(type="integer")
     * @orm:GeneratedValue(strategy="AUTO")
     */
    protected $id;

    /**
     * @orm:Column(type="string", length=255, unique=true)
     * @assert:NotBlank()
     */
    protected $title;

    // ... (rest of the class omitted)
}

Now I'm fetching a JSON file and updating my database with those items:

$em = $this->get('doctrine.orm.entity_manager');
        
foreach($json->value->items as $item) {
    $listItem = new ListItem();
    $listItem->setTitle($item->title);
    $em->persist($listItem);
}

$em->flush();

This works fine the first time, but the second time I'm getting an SQL error (of course): Integrity constraint violation: 1062 Duplicate entry

Sometimes my JSON file gets updated and some of the items are new, some are not. Is there a way to tell the entity manager to skip the duplicate items and just insert the new ones?

What's the best way to do this?

Thanks for any help. Please leave a comment if something is unclear.

Edit:

What works for me is doing something like this:

foreach ($json->value->items as $item) {
    // checkUniqueness() returns false when the title already exists
    $uniqueness = $em->getRepository('ListItem')->checkUniqueness($item->title);
    if (false == $uniqueness) {
        continue;
    }

    $listItem = new ListItem();
    $listItem->setTitle($item->title);
    $em->persist($listItem);
    $em->flush();
}

checkUniqueness is a method in my ListItem repository that checks whether the title is already in my DB.

That's horrible: it takes 2 database queries for each item, which adds up to about 85 database queries for this action.

asked Apr 17 '11 16:04 by choise


1 Answer

How about retrieving all the current titles into an array first and checking each title you are about to insert against that array?

$existingTitles = $em->getRepository('ListItem')->getCurrentTitles();

foreach($json->value->items as $item) {
  if (!in_array($item->title, $existingTitles)) {
    $listItem = new ListItem();
    $listItem->setTitle($item->title);
    $em->persist($listItem);
  }
}

$em->flush();

getCurrentTitles() would need to be added to the ListItem repository to simply return an array of titles.

This requires only one extra DB query, but it costs more memory to hold the current titles in an array. There may be problems with this method if your ListItem dataset is very big.

If the number of items you want to insert each time isn't too large, you could modify the getCurrentTitles() function to query only for items with the titles you're trying to insert. This way, $existingTitles can never hold more entries than your insert data list. Then you could perform your checks as above.
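As a rough sketch of the check itself (plain PHP, no Doctrine involved): in_array() scans the whole array on every call, so with many titles it may help to turn the existing titles into array keys and use isset() instead. The helper name filterNewTitles() is made up for illustration. It also skips titles that appear twice in the same JSON payload, which would otherwise still trigger the duplicate entry error on flush.

```php
<?php
// Hypothetical helper, not part of Doctrine or of the code above.
// Returns the incoming titles that are neither stored yet nor repeated in the payload.
function filterNewTitles(array $incomingTitles, array $existingTitles): array
{
    // Titles become array keys, so each lookup is a hash lookup
    // instead of a linear in_array() scan.
    $seen = array_fill_keys($existingTitles, true);

    $newTitles = [];
    foreach ($incomingTitles as $title) {
        if (isset($seen[$title])) {
            continue; // already stored, or already queued from this payload
        }
        $seen[$title] = true;
        $newTitles[] = $title;
    }

    return $newTitles;
}
```

The loop from the answer would then iterate over the returned titles and persist one ListItem per title, followed by a single flush.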

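// getCurrentTitles($newTitles) - $newTitles is an array of all new titles you want to insert
$rows = $qb->select('t.title')
   ->from('ListItem', 't')
   ->where($qb->expr()->in('t.title', ':titles'))
   ->setParameter('titles', $newTitles) // array parameters need a Doctrine version that supports them
   ->getQuery()
   ->getArrayResult();

// each row looks like array('title' => ...); flatten to a plain list of titles
return array_map(function ($row) { return $row['title']; }, $rows);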
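If the list of incoming titles can grow, one option is to run a query like the one above in batches so that no single IN list gets too long. The following is only a sketch under assumptions: findExistingTitles(), the $fetchExisting callback and the chunk size of 500 are made up for illustration. The callback stands in for whatever repository method runs the query for one batch and returns a flat list of titles.

```php
<?php
// Hypothetical helper: asks for existing titles in batches of $chunkSize.
// $fetchExisting receives an array of titles and must return the subset that
// is already stored (for example by running the IN query for that batch).
function findExistingTitles(array $newTitles, callable $fetchExisting, int $chunkSize = 500): array
{
    // Drop repeated titles first so each one is looked up only once.
    $uniqueTitles = array_values(array_unique($newTitles));

    $existing = [];
    foreach (array_chunk($uniqueTitles, $chunkSize) as $chunk) {
        foreach ($fetchExisting($chunk) as $title) {
            $existing[] = $title;
        }
    }

    return array_values(array_unique($existing));
}
```

This costs one query per batch rather than one per item, so around 40 items would still need only a single lookup query.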
answered Oct 21 '22 16:10 by d.syph.3r