
Most efficient way to do batch INSERT IGNORE using doctrine 2

I have a script that needs to get a list of entries in the database and then iterate over them, creating new entries in another table if they don't already exist.

Currently I'm doing:

foreach ($entries as $entry) {
    $newItem = new Item();
    $newItem->setAttribute($entry->getAttribute());
    $entityManager->persist($newItem);
    try {
        $entityManager->flush();
    } catch (\Exception $e) {
        // strpos() returns 0 for a match at the very start, so compare with false explicitly
        if (strpos($e->getMessage(), 'Duplicate') === false) {
            throw $e;
        }
        // refreshes the entity manager
        $entityManager = $this->getDoctrine()->getManager();
    }
}

However, doing it this way is very time-intensive: there are thousands of entries, and the script sometimes takes upwards of 10 minutes to complete. I have seen other posts suggest flushing every 20 or so records when doing batch processing like this. The problem is that if one of those 20 is a duplicate, the whole transaction dies, and I'm not sure how I would go back and find the offending entry to exclude it before resubmitting the rest.

Any help with this will be greatly appreciated.

Chase asked Feb 05 '14 22:02

1 Answer

You can do a single SELECT to fetch the records that already exist in the database and then simply skip them. Additionally, try to execute flush() and clear() just once, or experiment with the batch size. I would also suggest using a transaction (if you use InnoDB).

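$connection = $this->_em->getConnection();
$connection->beginTransaction();

try {
    // Keys that already exist, fetched beforehand with a single SELECT.
    $created = array(/* all primary keys that already exist */);
    $i = 1;
    // One flush for the whole set; lower this value to flush in smaller batches.
    $batchSize = count($entries);
    foreach ($entries as $entry) {

        // Skip entries that are already stored.
        if (in_array($entry->getMyPrimaryKey(), $created)) {
            continue;
        }

        $newItem = new Item();
        $newItem->setAttribute($entry->getAttribute());
        $this->_em->persist($newItem);

        if (($i % $batchSize) == 0) {
            $this->_em->flush();
            $this->_em->clear();
        }

        $i++;
    }

    // Flush whatever is still pending: skipped entries do not advance $i,
    // so the counter may never reach $batchSize inside the loop.
    $this->_em->flush();
    $this->_em->clear();

    $connection->commit();
} catch (\Exception $e) {
    $connection->rollback();
    $this->_em->close();

    // Keep the original exception attached as the previous one.
    throw new \RuntimeException($e->getMessage(), 0, $e);
}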
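
The "skip what already exists" step can be tried out on its own, without Doctrine. Below is a minimal sketch, assuming the keys are strings or integers; filterNewValues() is a hypothetical helper name, not part of Doctrine, and the sample data is made up. It also drops duplicates inside the incoming list itself, which a check against the database alone would not catch.

```php
<?php
// Hypothetical helper for illustration only; not a Doctrine API.

/**
 * Return the candidate values that are not stored yet, without repeats.
 *
 * @param array $candidates values read from the source table (strings or ints)
 * @param array $existing   values already present in the target table
 * @return array
 */
function filterNewValues(array $candidates, array $existing)
{
    // Flip once so every membership check is a constant-time isset()
    // rather than a linear scan of the whole list.
    $seen = array_flip($existing);
    $new = array();

    foreach ($candidates as $value) {
        if (isset($seen[$value])) {
            continue; // already stored, or already queued during this run
        }
        $seen[$value] = true; // guards against duplicates within the input
        $new[] = $value;
    }

    return $new;
}

// Made-up data; in the real script $existing would come from the single SELECT.
$existing = array('a', 'c');
$candidates = array('a', 'b', 'b', 'c', 'd');
$toInsert = filterNewValues($candidates, $existing);
// $toInsert is array('b', 'd')
```

Only the values returned by the helper would then need to be persisted, so a duplicate-key error should no longer be the expected path. A concurrent writer could still insert the same key between the SELECT and the flush, so the surrounding try/catch remains useful.
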
b.b3rn4rd answered Sep 28 '22 07:09
