PHP array_merge() with preference of first array and unique values only?

I would like to merge multiple arrays together, giving preference to the values from the first array and keeping only unique values. Is there a quicker way than using array_merge(), array_unique(), and the + operator?

function foo(...$params) {
    $a = [
        'col1',
        'col2_alias' => 'col2',
        'col3'
    ];
    $merged = array_merge($a, ...$params);
    $unique = array_unique($merged);
    print_r($merged);
    print_r($unique);
    print_r($a + $unique);
}

foo(
    ['col4', 'col5_alias' => 'col5', 'col6'], 
    ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10']);

Just merging the arrays gives me duplicate values, and overwrites values in the first array:

Array
(
    [0] => col1 // duplicate
    [col2_alias] => col10 // overwritten
    [1] => col3
    [2] => col4
    [col5_alias] => col5
    [3] => col6
    [4] => col7
    [5] => col1 // duplicate
)

Using array_unique() obviously fixes the duplicate values, but not the overwritten value:

Array
(
    [0] => col1
    [col2_alias] => col10
    [1] => col3
    [2] => col4
    [col5_alias] => col5
    [3] => col6
    [4] => col7
)

After using the + operator, the array is how I want it.

Array
(
    [0] => col1
    [col2_alias] => col2
    [1] => col3
    [2] => col4
    [col5_alias] => col5
    [3] => col6
    [4] => col7
)

asked Oct 30 '22 01:10 by Katrina


1 Answer

You're right to assume that combining array_merge(), array_unique(), and the + operator is slow. I've written a bit of code to benchmark each combination.

Here is that code...

<?php

class ArraySpeeds
{
    public $la = ['col1', 'col2_alias' => 'col2', 'col3'];
    public $a = ['col4', 'col5_alias' => 'col5', 'col6'];
    public $b = ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10'];
    public $c = [];

    public function executionTime ($callback)
    {
        $start = microtime (true);

        for ($i = 0; $i < 1000000; $i++) {
            $callback ();
        }

        return round ((microtime (true) - $start) * 1000) . '/ms' . PHP_EOL;
    }

    public function getTimes ()
    {
        $array_merge_time = $this->executionTime (function () {
            $this->c[0] = array_merge ($this->la, $this->a, $this->b);
        });

        $array_unique_time = $this->executionTime (function () {
            $merged = array_merge ($this->la, $this->a, $this->b);
            $this->c[1] = array_unique ($merged);
        });

        $addition_time = $this->executionTime (function () {
            $merged = array_merge ($this->la, $this->a, $this->b);
            $unique = array_unique ($merged);
            $this->c[2] = $this->la + $unique;
        });

        $array_diff_time = $this->executionTime (function () {
            $merged = array_merge ($this->a, $this->b);
            $diffed = array_diff ($merged, $this->la);

            $this->c[3] = array_merge ($diffed, $this->la);
        });

        echo print_r ($this->c[0], true), PHP_EOL;
        echo print_r ($this->c[1], true), PHP_EOL;
        echo print_r ($this->c[2], true), PHP_EOL;

        natsort ($this->c[3]);
        echo print_r ($this->c[3], true), PHP_EOL;

        echo 'array_merge: ', $array_merge_time;
        echo 'array_unique: ', $array_unique_time;
        echo 'addition: ', $addition_time;
        echo 'array_diff: ', $array_diff_time;
    }
}

$arrayspeeds = new ArraySpeeds ();
$arrayspeeds->getTimes ();

This is the output...

Array
(
    [0] => col1
    [col2_alias] => col10
    [1] => col3
    [2] => col4
    [col5_alias] => col5
    [3] => col6
    [4] => col7
    [5] => col1
)

Array
(
    [0] => col1
    [col2_alias] => col10
    [1] => col3
    [2] => col4
    [col5_alias] => col5
    [3] => col6
    [4] => col7
)

Array
(
    [0] => col1
    [col2_alias] => col2
    [1] => col3
    [2] => col4
    [col5_alias] => col5
    [3] => col6
    [4] => col7
)

Array
(
    [3] => col1
    [col2_alias] => col2
    [4] => col3
    [0] => col4
    [col5_alias] => col5
    [1] => col6
    [2] => col7
)

array_merge: 403/ms
array_unique: 1039/ms
addition: 1267/ms
array_diff: 993/ms

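You can see the execution time gets longer with each added function call. The figures are total milliseconds for 1,000,000 iterations, so the combination of array_merge, array_unique and the + operator is the slowest at 1267 ms, roughly three times as slow as array_merge alone at 403 ms.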
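If you want to rerun the comparison with the sort inside the timed callback, something like the following might do. It is only a sketch: timeCallback, $viaUnion and $viaDiff are names made up for illustration, the iteration count is arbitrary, and hrtime() assumes PHP 7.3 or later.

```php
<?php

// Hypothetical helper (not part of the benchmark class above): returns the
// total wall-clock milliseconds spent on $iterations calls of $callback.
function timeCallback(callable $callback, int $iterations = 100000): float
{
    $start = hrtime(true); // nanoseconds as a number; needs PHP 7.3+
    for ($i = 0; $i < $iterations; $i++) {
        $callback();
    }
    return (hrtime(true) - $start) / 1e6; // nanoseconds to milliseconds
}

$la = ['col1', 'col2_alias' => 'col2', 'col3'];
$a  = ['col4', 'col5_alias' => 'col5', 'col6'];
$b  = ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10'];

// The asker's combination: merge, de-duplicate, then let + restore $la's values.
$viaUnion = function () use ($la, $a, $b) {
    return $la + array_unique(array_merge($la, $a, $b));
};

// The array_diff variant, this time with natsort inside the timed callback.
$viaDiff = function () use ($la, $a, $b) {
    $merged = array_merge(array_diff(array_merge($a, $b), $la), $la);
    natsort($merged);
    return $merged;
};

printf("union: %.1f ms\n", timeCallback($viaUnion));
printf("diff + natsort: %.1f ms\n", timeCallback($viaDiff));
```

The absolute numbers will depend on the machine and PHP version, so only the ratio between the two lines is worth reading.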

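However, using array_diff gives decent performance (993 ms, although the natsort call is not part of that measurement) and the correct values, just not in the right order. Calling natsort on the result fixes the order for this data, because the values col1 to col7 happen to sort naturally into the wanted sequence.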

For example...

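function foo (...$params)
{
    $a = [
        'col1',
        'col2_alias' => 'col2',
        'col3'
    ];

    // Drop every value from the later arrays that already exists in $a.
    $diff = array_diff (array_merge (...$params), $a);
    // Merge $a last so its string keys (e.g. col2_alias) win any clash.
    $merged = array_merge ($diff, $a);
    // Sort by value in natural order; keys are preserved.
    natsort ($merged);
    print_r ($merged);
}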
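
If your values don't happen to sort into the order you want, a single pass that skips already-seen values may be an alternative, since it keeps the first array's order and keys without any sorting. This is a minimal sketch, not a benchmarked solution: mergePreferFirst is a hypothetical name, it assumes every value is a string or an integer (array_flip needs that), and when the same string key appears in several arrays it keeps the earliest value, which differs from array_merge where the last one wins.

```php
<?php

// Hypothetical helper: merges $others into $first, keeping $first's values
// and order, and skipping any value that has already been added.
// Assumption: all values are strings or integers.
function mergePreferFirst(array $first, array ...$others): array
{
    $result = $first;
    $seen = array_flip($first); // values become keys for fast lookups

    foreach ($others as $array) {
        foreach ($array as $key => $value) {
            if (isset($seen[$value])) {
                continue; // duplicate value, already present
            }
            if (is_string($key)) {
                if (array_key_exists($key, $result)) {
                    continue; // earlier array already owns this key
                }
                $result[$key] = $value;
            } else {
                $result[] = $value; // renumber integer keys
            }
            $seen[$value] = true;
        }
    }

    return $result;
}

print_r(mergePreferFirst(
    ['col1', 'col2_alias' => 'col2', 'col3'],
    ['col4', 'col5_alias' => 'col5', 'col6'],
    ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10']
));
```

For the arrays from the question this gives the same keys, values and order as the array_merge, array_unique and + combination.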

answered Nov 15 '22 06:11 by Coffee'd Up Hacker