Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Arrange/distribute array items uniformly

I have a multidimensional associative array with a type property. It looks like this:

$data = array(
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "C"),
  array( "name" => "SomeName", "type" => "C")
);

I want to rearrange it to make the items more equally distributed (with least amount of repetitive types if possible). It should look like this:

array(
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "C"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "C"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "B")
);

What I've tried so far was to find the count for each type and the total:

$count_a = 5;
$count_b = 3;
$count_c = 2;
$total = 10;

And also the rate ratio for each type:

$ratio_a = 0.5; //(5/10)
$ratio_b = 0.3; //(3/10)
$ratio_c = 0.2; //(2/10)

I'm just stuck here. Should I try to make a new property index with a number and then sort based on it? Or maybe use modulo operator somehow? I've also tried separating the items into 3 different arrays if that makes it any easier.

like image 817
Miro Avatar asked Feb 09 '17 22:02

Miro


4 Answers

Here's a solution that avoids repeating patterns whenever possible.

For AAAAABBBCC it would generate ABABABACAC;

For AAAAABBBCCC it would generate ABCABABACAC;

Apart from sorting by type count, it runs in linear time (it accepts an unsorted data array). The result is in $distributed_data. For explanation see below.

Code

$data = array(
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "B"),
);

$distributed_data = array();
$counts = array();
$size = sizeof($data);

// Count values
foreach ($data as $entry) {
  $counts[$entry["type"]] = isset($counts[$entry["type"]]) ? $counts[$entry["type"]] + 1 : 1;
}

// Set counter
for ($i = 0; $i < $size; $i++) {
  $data[$i]["count"] = $counts[$data[$i]["type"]];
}

// Sort by count
usort($data, function($entry1, $entry2) {
    return $entry2["count"] <=> $entry1["count"];
});

// Generate the distributed array
$max_length = $data[0]["count"];
$rows = ceil($size / $max_length);
$last_row = ($size - 1) % $max_length + 1;
$row_cycle = $rows;

$row = 0;
$col = 0;
for ($i = 0; $i < $size; $i++) {
  if ($i == $rows * $last_row) {
    $row_cycle -= 1;
  }

  $distributed_data[$i] = $data[$row * $max_length + $col];

  $row = ($row + 1) % $row_cycle;
  if ($row == 0) {
    $col++;
  }
}

Explanation

First, order the entries according to the number of repetitions each type has. E.g. CBBCAAB becomes BBBAACC.

Then imagine a table that has as many columns as the most frequent occurrence (e.g. if you have AAAABBCC, the most frequent occurrence would be 4, and the table would have 4 columns).

Then write all entries into the table, left to right, jumping to new row as necessary.

E.g. for AAAAABBBCCC you would get a table like this:

Table example

To generate the final distributed array just read the entries top-down, shifting to a new column as necessary.

In the above example, you would get ABCABABACAC.

The only way to get repeating entries is to either have two of the same characters in a column, or meet the same character when shifting to a column on the right.

The first scenario can't happen because a character group would need to wrap around and this can't happen, because there is no character group longer than the number of columns (that's how we defined the table).

The second scenario can only happen when the second row isn't full. E.g. AAAABB leaves the second row with two empty cells.

like image 99
naktinis Avatar answered Nov 14 '22 15:11

naktinis


Algorithm:

function distribute($data) {
    $groups = [];
    foreach ($data as $row) {
        $groups[$row['type']][] = $row;
    }
    $groupSizes = array_map('count', $groups);
    asort($groupSizes);

    $result = [];
    foreach ($groupSizes as $type => $groupSize) {
        if (count($result) == 0) {
            $result = $groups[$type];
        } elseif (count($result) >= count($groups[$type])) {
            $result = merge($result, $groups[$type]);
        } else {
            $result = merge($groups[$type], $result);
        }
    }
    return $result;
}

function merge($a, $b) {
    $c1 = count($a);
    $c2 = count($b);
    $result = [];
    $i1 = $i2 = 0;
    while ($i1 < $c1) {
        $result[] = $a[$i1++];
        while ($i2 < $c2 && ($i2+1)/($c2+1) < ($i1+1)/($c1+1)) {
            $result[] = $b[$i2++];
        }
    }
    return $result;
}

The main idea is to split the data into groups and merge the next smallest group into the result (starting with an empty result).

While merging two arrays the items are sorted by a float key, which is calculated (on the flow) in this line

while ($i2 < $c2 && ($i2+1)/($c2+1) < ($i1+1)/($c1+1))

as

floatKey = (index + 1) / (groupSize + 1)

(This part however can be improved, so the distance to the "corners" (0 and 1) would be half as big as the distance between two items).

On tie the item from the bigger group comes first.

Example: Merging AAAA and BB the keys for A would be 0.2, 0.4, 0.6, 0.8 anf for B - 0.33, 0.66. The result would be

A(0.2), B(0.33), A(0.4), A(0.6), B(0.66), A(0.8)

Tests:

$testData = [
    'AAAAABBBCC',
    'AAAAABBBCCC',
    'ABBCCC',
    'AAAAAABBC',
    'AAAAAABBBBCCD',
    'AAAAAAAAAABC',
    'hpp',
    'stackoverflow',
    'ACCD', // :-)
];

$results = [];

foreach ($testData as $dataStr) {
    $a = str_split($dataStr);
    $data = [];
    foreach ($a as $type) {
        $data[] = ['type' => $type];
    }
    $result = distribute($data);
    $resultStr = implode(array_column($result, 'type'));
    $results[$dataStr] = $resultStr;
}
var_export($results);

Test results:

'AAAAABBBCC' => 'BACABACABA',
'AAAAABBBCCC' => 'CABACABACAB',
'ABBCCC' => 'BCACBC',
'AAAAAABBC' => 'ABAACAABA',
'AAAAAABBBBCCD' => 'BACABADABACAB',
'AAAAAAAAAABC' => 'AAACAAAABAAA',
'hpp' => 'php',
'stackoverflow' => 'sakeofwlrovct',
'ACCD' => 'ACDC',

Test demo: http://rextester.com/BWBD90255

You can easily add more test cases to the demo.

like image 24
Paul Spiegel Avatar answered Nov 14 '22 14:11

Paul Spiegel


You should take a sorted array of sorted types and do iterative walk through it step by step changing selected type by one.

$data = array(
  array( "name" => "SomeName1", "type" => "A"),
  array( "name" => "SomeName2", "type" => "A"),
  array( "name" => "SomeName3", "type" => "A"),
  array( "name" => "SomeName4", "type" => "A"),
  array( "name" => "SomeName5", "type" => "A"),
  array( "name" => "SomeName6", "type" => "B"),
  array( "name" => "SomeName7", "type" => "B"),
  array( "name" => "SomeName8", "type" => "B"),
  array( "name" => "SomeName9", "type" => "C"),
  array( "name" => "SomeName0", "type" => "C")
);

$dataSorted = array();
$counts = array();

foreach($data as $elem) {
    // just init values for a new type
    if(!isset($counts[$elem['type']])) {
        $counts[$elem['type']] = 0;
        $dataByType[$elem['type']] =  array();
    }

    // count types
    $counts[$elem['type']]++;

    // save it to grouped array
    $dataByType[$elem['type']][] =  $elem;
}

// sort it to A=>5, B=>3 C=>2
arsort($counts, SORT_NUMERIC);

// get sorted types as an array
$types = array_keys($counts);

// index will be looped 0 -> count($types) - 1 and then down to 0 again
$currentTypeIndex = 0;

// make a walk on sorted array. First get the most popular, then less popular etc.
// when all types are added, repeat
while(count($dataSorted) < count($data)) {
    $currentType = $types[$currentTypeIndex];

    // skip adding if we ran out this type
    if($counts[$currentType]) {
        // pop an element of selected type
        $dataSorted[] = array_pop($dataByType[$currentType]);

        // decrease counter
        $counts[$currentType]--;
    }

    // choose next type
    $currentTypeIndex = (++$currentTypeIndex)%count($types);
}

print_r($dataSorted);

The code outputs elements in order of ABCABCABAA.

UPD. trailing doubling takes place in case count(maxtype) > count(nexttype) + 1

like image 4
shukshin.ivan Avatar answered Nov 14 '22 16:11

shukshin.ivan


$data = array(
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "A"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "B"),
  array( "name" => "SomeName", "type" => "C"),
  array( "name" => "SomeName", "type" => "C")
);

//make seperate arrays
echo "<pre>";
foreach($data as $val){

    ${$val["type"]}[]=$val["name"];
    $types[]=$val['type'];
}

$types=array_unique($types);

//make ratio
foreach($types as $val){
    $cnt[]=count($$val);
}
//find maximum from ratio
echo $max=max($cnt);
echo $min=min($cnt);


for($i=0;$i<$max;$i++){
    foreach($types as $val){

            if(isset($$val[$i])){
                $new_array[]=array("name"=>$$val[$i],"type"=>$val);
            }
        }
}

print_r($new_array);

Fiddle: http://phpfiddle.org/main/code/ju2k-abte

Explanation

     - Step 1: Make separate array   

     - Step 2: Count all array and find out the ratio



     - Step 3: Iterate with array with maximum ratio value
     - Step 4: Make array with same index together  and merge them in multidimensional
       array
like image 3
sumit Avatar answered Nov 14 '22 14:11

sumit