array - out of memory in php

Tags:

arrays

php

I have a big array of associative arrays. Each associative array consists of around 15 keys of different types (string, integer, float) - Smaller example below:

$array = [
    [
      "key1" => "string",
      "key2" => 10,
      "key3" => 4.05
    ],       
    [
      "key1" => "string2",
      "key2" => 20,
      "key3" => 1.05
    ],       
   ...
];

Now I want to iterate over this array and add some keys like

$map = array_map(function (array $item) {            
       $item['key4'] = 1;
       $item['key5'] = 1;
       $item['key6'] = 1;
       return $item;
   }, $array);

Problem: for an array containing a large number of associative arrays, adding the new keys hits the memory limit and the script is terminated. Do you have any solutions?

Joe asked Oct 02 '17 19:10


1 Answer

You could paginate your data, chunk your array to work with smaller pieces, or even increase memory_limit, but let's assume that you have a big array and can't do otherwise.
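As a rough illustration of the "smaller pieces" idea, and assuming the rows can be consumed one at a time (written out, aggregated, and so on) rather than kept in a second array, a generator can yield each enriched row lazily. The function name withExtraKeys is hypothetical, and generators are a different technique from the ones measured below, so treat this as a sketch only.

```php
<?php
// Hypothetical helper (not part of the original question): yields each row
// with the extra keys added, one row at a time, instead of building a
// second full-size array the way array_map() does.
function withExtraKeys(iterable $rows): Generator
{
    foreach ($rows as $index => $row) {
        $row['key4'] = 1;
        $row['key5'] = 1;
        $row['key6'] = 1;
        yield $index => $row;
    }
}

$array = [
    ['key1' => 'string',  'key2' => 10, 'key3' => 4.05],
    ['key1' => 'string2', 'key2' => 20, 'key3' => 1.05],
];

foreach (withExtraKeys($array) as $row) {
    // Consume the row here; only one enriched row exists at any moment.
    echo $row['key1'], ' has ', count($row), " keys\n";
}
```

This only helps if nothing downstream needs all enriched rows in memory at once; otherwise the in-place approaches below are the better fit.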

So let’s play with a 1,000,000-element array and try different solutions. For each one I give the memory consumption and compute time measured on my laptop.
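The answer does not say how the figures were collected. One plausible way, shown here as a sketch with a hypothetical measure() helper, is to combine microtime() with memory_get_peak_usage(). The row count is reduced to 100,000 so the example fits under a default memory_limit.

```php
<?php
// Hypothetical helper, not taken from the original answer: runs a callable
// and reports wall-clock time plus the script's peak memory so far.
function measure(callable $fn): array
{
    $start = microtime(true);
    $fn();
    $elapsedMs = (microtime(true) - $start) * 1000;

    return [
        'elapsed_ms' => $elapsedMs,
        // The peak covers the whole script run, so each variant should be
        // measured in its own process to keep the numbers comparable.
        'peak_mb' => memory_get_peak_usage() / 1024 / 1024,
    ];
}

$stats = measure(function () {
    $array = [];
    for ($i = 0; $i < 100000; $i++) {
        $array[$i] = ['key' => 'value', 'key2' => $i, 'key3' => $i / 3];
    }
});

printf("%.1f MB / %.0f ms\n", $stats['peak_mb'], $stats['elapsed_ms']);
```

Absolute numbers will differ between PHP versions and machines; only the relative ordering of the approaches is meaningful.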

Current solution (857MB / 640ms)

$array = [];
for ($i=0; $i< 1000000; $i++){
    $array[$i] = [
        "key" => 'value',
        "key2" => $i,
        "key3" => $i / 3
    ];
}

$map = array_map(function (array $item) {
    $item['key4'] = 1;
    $item['key5'] = 1;
    $item['key6'] = 1;
    return $item;
}, $array);

With this code, memory consumption on my laptop is 857MB and compute time is 640ms.

In your example, array_map builds a whole new $map array alongside $array, so both the original rows and the modified copies are held in memory at the same time.

Working with references (480MB / 220ms)

$array = [];
for ($i=0; $i< 1000000; $i++){
    $array[$i] = [
        "key" => 'value',
        "key2" => $i,
        "key3" => $i / 3
    ];
}

foreach ($array as &$item) {
    $item['key4'] = 1;
    $item['key5'] = 1;
    $item['key6'] = 1;
}
unset($item); // break the reference so later use of $item cannot overwrite the last element

By using &$item we are asking PHP to give us access to the variable by reference, meaning that we modify the data directly in memory without creating a new copy of it.

This is why this script consumes a lot less memory and compute time.
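One caveat with a by-reference foreach: the loop variable stays bound to the last element after the loop ends, so reusing the same variable name later can silently overwrite that element. The two functions below are hypothetical and only contrast the behaviour with and without unset().

```php
<?php
// Hypothetical demo: the reference is NOT released after the first loop.
function withoutUnset(array $values): array
{
    foreach ($values as &$value) {
        $value *= 10;
    }
    // $value still points at the last element, so this read-only looking
    // loop writes each element into the last slot as it goes.
    foreach ($values as $value) {
    }
    return $values;
}

// Hypothetical demo: same code, but the reference is released.
function withUnset(array $values): array
{
    foreach ($values as &$value) {
        $value *= 10;
    }
    unset($value); // break the reference to the last element
    foreach ($values as $value) {
    }
    return $values;
}

print_r(withoutUnset([1, 2, 3])); // last element ends up as 20, not 30
print_r(withUnset([1, 2, 3]));    // 10, 20, 30 as intended
```

Calling unset() on the loop variable right after a by-reference loop is a cheap habit that avoids this class of bug.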

Working with classes (223MB / 95ms)

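Under the hood, PHP uses C data structures to manage data in memory. An object whose properties are declared in its class has a predictable layout, which is much easier for PHP to optimize than an associative array, where each row is a full hash table. It is well explained here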

class TestClass {
    public $key1, $key2, $key3, $key4, $key5, $key6;
}

$array = [];
for ($i=0; $i< 1000000; $i++){
    $array[$i] = new TestClass();
    $array[$i]->key1 = 'value';
    $array[$i]->key2 = $i;
    $array[$i]->key3 = $i / 3;
}

foreach ($array as $item) {
    $item->key4 = 1;
    $item->key5 = 1;
    $item->key6 = 1;
}

You can see that memory consumption and iteration time are much lower. This is because PHP doesn't need to modify the structure of the data in memory: every field of the object is ready to receive data.
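To check the difference on your own machine, the following sketch compares the memory held by rows stored as arrays against rows stored as objects with declared properties. The Row class and the bytesUsedBy() helper are hypothetical names, and the row count is reduced to 100,000 to stay under a default memory_limit.

```php
<?php
// Hypothetical class with all six properties declared up front.
class Row
{
    public $key1, $key2, $key3, $key4, $key5, $key6;
}

// Hypothetical helper: bytes allocated while $build runs, measured while
// its return value is still alive.
function bytesUsedBy(callable $build): int
{
    $before = memory_get_usage();
    $data = $build();
    return memory_get_usage() - $before;
}

$count = 100000;

$arrayBytes = bytesUsedBy(function () use ($count) {
    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $rows[$i] = ['key1' => 'value', 'key2' => $i, 'key3' => $i / 3];
    }
    return $rows;
});

$objectBytes = bytesUsedBy(function () use ($count) {
    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $row = new Row();
        $row->key1 = 'value';
        $row->key2 = $i;
        $row->key3 = $i / 3;
        $rows[$i] = $row;
    }
    return $rows;
});

printf("arrays:  ~%d bytes per row\n", intdiv($arrayBytes, $count));
printf("objects: ~%d bytes per row\n", intdiv($objectBytes, $count));
```

The exact per-row figures depend on the PHP version and build, so no expected output is given here.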

Be careful, though: if you add a field that wasn't declared in the class (e.g. $item->newKey = 1, where newKey is created dynamically), this memory optimisation is no longer possible and you jump to 620MB memory usage and 280ms compute time.


If you want to go further and are not afraid of headaches, take a look at the Standard PHP Library (SPL): you will find a lot of optimized data structures (fixed arrays, iterators and so on).
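As a small taste of the SPL, here is a minimal sketch using SplFixedArray, which has a fixed size and integer-only indexes. It was not benchmarked in this answer, and how much it saves over a plain array depends on the PHP version, so measure before relying on it.

```php
<?php
$size = 5;

// The size is fixed up front; indexes must be integers from 0 to $size - 1.
$fixed = new SplFixedArray($size);
for ($i = 0; $i < $size; $i++) {
    $fixed[$i] = $i * 1.5;
}

// Writing outside the declared size throws instead of growing the array.
$caught = false;
try {
    $fixed[$size] = 1.0;
} catch (RuntimeException $e) {
    $caught = true;
}

echo 'size: ', $fixed->getSize(), "\n";
echo 'out-of-range write rejected: ', $caught ? 'yes' : 'no', "\n";
```

Note that SplFixedArray only replaces the outer list; each row still needs its own representation (array or object).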

PS: benchmarks were run with Xdebug disabled.

Creaforge answered Oct 04 '22 04:10