
Eloquent chunk() missing half the results

I have a problem with the chunk() method of Eloquent, Laravel's ORM: it misses some results. Here is a test query:

$destinataires = Destinataire::where('statut', '<', 3)
    ->where('tokenized_at', '<', $date_active)
    ->chunk($this->chunk, function ($destinataires) {
        foreach($destinataires as $destinataire) {
            $this->i++;
        }
    });
echo $this->i;

It gives 124838 results.

But:

$num_dest = Destinataire::where('statut', '<', 3)
    ->where('tokenized_at', '<', $date_active)
    ->count();
echo $num_dest;

gives 249676, exactly twice the count from the first code example.

My script is supposed to edit all matching records in the database. If I run it multiple times, it processes only half of the remaining records each time.

I tried DB::table() instead of the Model. I tried adding ->take(20000), but it doesn't seem to be taken into account. I echoed the query with ->toSql() and everything seems fine (the LIMIT clause is added when I add the ->take() call).

Any suggestions?

Didier Sampaolo asked Sep 21 '15 16:09


2 Answers

Imagine you are using the chunk method to delete all of the records in a table. The table has 2,000,000 records and you are going to delete them in chunks of 1000.

$query->orderBy('id')->chunk(1000, function ($items) {
    foreach($items as $item) {
        $item->delete();
    }
});

It deletes the first 1000 records after fetching them with a query like this:

SELECT * FROM table ORDER BY id LIMIT 0,1000

The next query issued by the chunk method is then:

SELECT * FROM table ORDER BY id LIMIT 1000,1000

This is where the problem is. The first step deleted 1000 records, so the rows that used to sit at positions 1000 to 1999 have moved to positions 0 to 999. The second query starts at offset 1000 and therefore skips them, so those 1000 records are never deleted. The same thing happens at every step: each step skips 1000 records, which is why only half of the rows end up being processed.

I used deletion as the example because it makes the behavior of the chunk method easy to see. The same applies to an update that makes rows stop matching the WHERE clause of the chunked query.
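
To make the skipping visible, here is a rough sketch in plain PHP, without Laravel or a database. It is only a simulation under assumptions: the array $table is a hypothetical stand-in for the table (ids 1 to 8), array_slice plays the role of LIMIT/OFFSET, and the loop imitates the offset-based paging described above rather than reproducing Laravel's actual code.

```php
<?php
// Hypothetical in-memory stand-in for a table with ids 1..8.
$table = range(1, 8);
$chunkSize = 2;
$deleted = [];

// Imitate offset paging: page 0 starts at offset 0, page 1 at $chunkSize, etc.
for ($page = 0; ; $page++) {
    // Rough equivalent of "ORDER BY id LIMIT $chunkSize OFFSET $page * $chunkSize".
    $rows = array_slice($table, $page * $chunkSize, $chunkSize);
    if (count($rows) === 0) {
        break;
    }
    foreach ($rows as $id) {
        // Deleting shrinks the table, so the remaining rows shift toward offset 0.
        $table = array_values(array_diff($table, [$id]));
        $deleted[] = $id;
    }
}

echo 'deleted: ' . implode(',', $deleted) . PHP_EOL; // deleted: 1,2,5,6
echo 'left:    ' . implode(',', $table) . PHP_EOL;   // left:    3,4,7,8
```

Half of the rows survive, and running the loop again on what is left would again process only half of it, which matches the behavior described in the question.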


UPDATE:

You can use chunkById() for deleting safely.

Read more here:

http://laravel.at.jeffsbox.eu/laravel-5-eloquent-builder-chunk-chunkbyid
https://laravel.com/api/5.4/Illuminate/Database/Eloquent/Builder.html#method_chunkById

Misagh Laghaei answered Nov 13 '22 00:11


Quick answer: Use chunkById() instead of chunk().

When updating or deleting records while iterating over them, any changes to the primary key or foreign keys could affect the chunk query. This could potentially result in records not being included in the results.

The explanation comes from the Laravel documentation, which is quoted after the example.

Here is the example solution:

DB::table('users')->where('active', false)
    ->chunkById(100, function ($users) {
        foreach ($users as $user) {
            DB::table('users')
                ->where('id', $user->id)
                ->update(['active' => true]);
        }
    });

If you are updating database records while chunking results, your chunk results could change in unexpected ways. If you plan to update the retrieved records while chunking, it is always best to use the chunkById method instead. This method will automatically paginate the results based on the record's primary key.
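
As a rough illustration of why paginating by primary key avoids the skipping, here is a plain-PHP sketch without Laravel or a database. It rests on assumptions: the array $table is a hypothetical stand-in for the table (ids 1 to 8), and the filter imitates a query of the form "WHERE id > last seen id ORDER BY id LIMIT n". It is a simulation of the idea, not Laravel's implementation.

```php
<?php
// Hypothetical in-memory stand-in for a table with ids 1..8.
$table = range(1, 8);
$chunkSize = 2;
$lastId = 0;
$deleted = [];

while (true) {
    // Rough equivalent of "WHERE id > $lastId ORDER BY id LIMIT $chunkSize".
    $rows = array_slice(
        array_values(array_filter($table, fn ($id) => $id > $lastId)),
        0,
        $chunkSize
    );
    if (count($rows) === 0) {
        break;
    }
    foreach ($rows as $id) {
        // Deleting rows does not matter here: the next page is found by id, not by offset.
        $table = array_values(array_diff($table, [$id]));
        $deleted[] = $id;
    }
    // Remember the highest id seen so the next page starts right after it.
    $lastId = end($rows);
}

echo 'deleted: ' . implode(',', $deleted) . PHP_EOL; // deleted: 1,2,3,4,5,6,7,8
echo 'left:    ' . count($table) . PHP_EOL;          // left:    0
```

Because each page is located by the last id seen instead of a row offset, removing rows from earlier pages cannot shift later rows out of reach.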

(end of the update)

The original answer:

I had the same problem: only half of the total results were passed to the callback function of the chunk() method.

Here is the code which had problems:

Transaction::whereNull('processed')->chunk(100, function ($transactions) {
    $transactions->each(function($transaction){
        $transaction->process();
    });
});

I was using Laravel 5.4 and solved the problem by replacing the chunk() method with the cursor() method and changing the code accordingly:

foreach (Transaction::whereNull('processed')->cursor() as $transaction) {
    $transaction->process();
}

Even though this answer doesn't explain the underlying problem, it provides a working solution.

svet answered Nov 13 '22 01:11