The best way to break a generator into chunks

Can you help me rewrite this code more elegantly? It breaks the items yielded by a generator into chunks of 100 and saves each chunk to the database.

$batchSize = 100;

$batch = [];
$i = 0;

/** 
 * @yield array $item
 */
foreach(itemsGenerator() as $item) {
    $batch[] = $item;
    $i++;

    if ($i === $batchSize) {
        Db::table('items')->save($batch);

        $batch = [];
        $i = 0;
    }
}

if ($batch) {
    Db::table('items')->save($batch);
}

I don't want to put the chunking logic into itemsGenerator() itself.

mikatakana asked Mar 14 '23 04:03


1 Answer

You can put the chunk logic into a separate reusable function.

Solution 1: Every chunk is a generator.

https://3v4l.org/3eSQm

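// Yields up to $n items from $generator, advancing it along the way.
// Each chunk must be consumed completely before the next one is requested.
function chunk(\Generator $generator, $n) {
    for ($i = 0; $generator->valid() && $i < $n; $generator->next(), ++$i) {
        yield $generator->current();
    }
}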

function foo() {
    for ($i = 0; $i < 11; ++$i) {
        yield $i;
    }
}

for ($gen = foo(); $gen->valid();) {
    $chunk = [];
    foreach (chunk($gen, 3) as $value) {
        $chunk[] = $value;
    }
    print json_encode($chunk) . "\n";
}
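
The inner foreach above only copies each chunk into an array, so it can probably be replaced by PHP's built-in iterator_to_array(). The sketch below repeats chunk() and foo() from Solution 1 (the type declarations are an addition, not part of the original) and collects all chunks into a hypothetical $chunks array so the result is easy to inspect. Keep in mind that each chunk generator has to be consumed completely before the next one is requested. If a chunk is abandoned early, the source generator is not advanced past the last item read, so the next chunk starts with that same item again.

```php
<?php
// Same logic as chunk() in Solution 1; type declarations added for illustration.
function chunk(\Generator $generator, int $n): \Generator {
    for ($i = 0; $generator->valid() && $i < $n; $generator->next(), ++$i) {
        yield $generator->current();
    }
}

// Example source generator, as in Solution 1.
function foo(): \Generator {
    for ($i = 0; $i < 11; ++$i) {
        yield $i;
    }
}

// $chunks is a hypothetical collector used only to show the result.
$chunks = [];
for ($gen = foo(); $gen->valid();) {
    // iterator_to_array() drains the chunk generator completely;
    // passing false as the second argument re-indexes the values from 0.
    $chunks[] = iterator_to_array(chunk($gen, 3), false);
}

print json_encode($chunks) . "\n";
```

This prints [[0,1,2],[3,4,5],[6,7,8],[9,10]].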

Solution 2: Every chunk is an array.

https://3v4l.org/aSfeR

function generator_chunks(\Generator $generator, $max_chunk_size) {
  $chunk = [];
  foreach ($generator as $item) {
    $chunk[] = $item;
    // count() keeps this example short; a local counter variable might be slightly faster.
    if (count($chunk) >= $max_chunk_size) {
      yield $chunk;
      $chunk = [];
    }
  }
  if ([] !== $chunk) {
    // Remaining chunk with fewer items.
    yield $chunk;
  }
}

function generator() {
    for ($i = 0; $i < 11; ++$i) {
        yield $i;
    }
}

foreach (generator_chunks(generator(), 3) as $chunk) {
    print json_encode($chunk) . "\n";
}

With this approach, one whole chunk at a time is held in memory as an array, but never the entire sequence.
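
Applied to the loop from the question, Solution 2 could look roughly like the sketch below. It repeats generator_chunks() so that it runs on its own (the type declarations are an addition). The itemsGenerator() body and the $save closure are hypothetical stand-ins for the question's real generator and for Db::table('items')->save($batch), which are not shown in full in the question.

```php
<?php
// Same logic as generator_chunks() in Solution 2; type declarations added.
function generator_chunks(\Generator $generator, int $max_chunk_size): \Generator {
    $chunk = [];
    foreach ($generator as $item) {
        $chunk[] = $item;
        if (count($chunk) >= $max_chunk_size) {
            yield $chunk;
            $chunk = [];
        }
    }
    if ([] !== $chunk) {
        // Remaining chunk with fewer items.
        yield $chunk;
    }
}

// Hypothetical stand-in for the question's itemsGenerator(): 250 fake rows.
function itemsGenerator(): \Generator {
    for ($i = 1; $i <= 250; ++$i) {
        yield ['id' => $i];
    }
}

// Hypothetical stand-in for Db::table('items')->save($batch).
// It only records the size of each batch it receives.
$savedBatchSizes = [];
$save = function (array $batch) use (&$savedBatchSizes): void {
    $savedBatchSizes[] = count($batch);
};

$batchSize = 100;
foreach (generator_chunks(itemsGenerator(), $batchSize) as $batch) {
    $save($batch); // in the question: Db::table('items')->save($batch);
}

print json_encode($savedBatchSizes) . "\n";
```

With 250 items and a batch size of 100 this prints [100,100,50]. The counter, the reset and the trailing "if ($batch)" check from the question all move into generator_chunks(), so the calling loop is reduced to a single save per batch.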

If even a single chunk is too large to hold in memory, Solution 1 makes each chunk behave like a generator instead.

donquixote answered Mar 20 '23 17:03