Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding all non-conflicting combinations of values from multiple lists of values

I have the following array which contains arrays of values:

$array = array(
    array('1', '2'),
    array('a', 'b', 'c'),
    array('x', 'y'),
);

There can be any number of arrays and an array can contain any number of values. I currently have a piece of code which will generate all combinations where one value is taken from each array. eg:

1ax, 1ay, 1bx, 1by, 1cx, 1cy, 2ax, 2ay, 2bx, 2by, 2cx, 2cy

However what I actually want is only the combinations where only one value sits in each column, ie. 1ax is no good because all three values 1, a and x sit in the first column, 1by is no good because b and y sit in the second column. So from the above example only these combinations would be valid:

1cy, 2cx

I originally planned to just generate all combinations and then filter out the ones with conflicts, but that doesn't scale as this is an oversimplified example, in the real application there's going to be situations where there are potentially millions of combinations (including conflicting ones).

Can anyone help with a better way to solve this? I'm working in PHP, but any code sample that clearly demonstrates the logic would be helpful.

Thanks in advance.


Update:

I've tested the solutions that work against a bigger dataset, to get some benchmarks, these are the results so far:

$array = array(
    array('1', '2', '3', '1', '2', '3', '1', '2', '3', '1', '2', '3', '1', '2', '3'),
    array('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'),
    array('x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'),
    array('1', '2', '3', '1', '2', '3', '1', '2', '3'),
    array('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'),
    array('x', 'y', 'z'),
);

Josh Davis 2nd Solution:

Combinations:      249480
Time:              0.3180251121521 secs
Memory Usage:      22.012168884277 mb
Peak Memory Usage: 22.03059387207 mb

Josh Davis:

Combinations:      249480
Time:              1.1172790527344 secs
Memory Usage:      22.004837036133 mb
Peak Memory Usage: 22.017387390137 mb

Tom Haigh:

Combinations:      249480
Time:              5.7098741531372 secs
Memory Usage:      39.145843505859 mb
Peak Memory Usage: 39.145843505859 mb
like image 958
Jack Sleight Avatar asked Dec 09 '22 19:12

Jack Sleight


1 Answers

Interesting problem! This turned out to be more complex than I thought, but it seems to work.

Basic strategy is to first order the arrays smallest to largest (keeping track of what order they were in, so I can output the answers in the correct order).

I keep answers in the form of an array of indexes into this sorted array of input lists.

Now that the lists are sorted, I can store the first correct answer as array(0,1,2,...,n);

Then I recurse into a function for trying all values in the first slot there (the 0 above) by swapping it with other values in that answer array (all the ones that aren't too big for that slot). Since I've got it sorted by size, I can move any value to the right when I'm swapping, without worrying about it being to big for that right slot.

outputting each valid slot has some crazy indirection to undo all the sorting.

Sorry if this looks confusing. I didn't spend much time cleaning it up.

<?php
# $lists is an array of arrays
function noconfcombos($lists) {
    $lengths = array();
    foreach($lists as $list) {
        $lengths[] = count($list);
    }

    # find one solution (and make sure there is one)
    $answer = array();
    $sorted_lengths = $lengths;
    asort($sorted_lengths);
    $answer_order_lists = array();
    $answer_order_lengths = array();
    $output_order = array();
    $min = 1;
    $max_list_length = 0;
    foreach($sorted_lengths as $lists_key => $list_max) {
        if($list_max < $min) {
            # no possible combos
            return array();
        }
        $answer[] = $min - 1; # min-1 is lowest possible value (handing out colums starting with smallest rows)
        $output_order[$lists_key] = $min - 1; # min-1 is which slot in $answers corresponds to this list
        $answer_order_lists[] = $lists[$lists_key];
        $answer_order_lengths[] = $lengths[$lists_key];
        ++$min;
    }
    ksort($output_order);
    $number_of_lists = count($lists);
    $max_list_length = end($sorted_lengths);
    if($max_list_length > $number_of_lists) {
       for($i = $number_of_lists; $i < $max_list_length; ++$i) {
          $answer[] = $i;
       }
       $stop_at = $number_of_lists;
    } else {
       $stop_at = $number_of_lists - 1;
    }

    # now $answer is valid (it has the keys into the arrays in $list for the
    # answer), and we can find the others by swapping around the values in
    # $answer.

    $ret = array();
    $ret[] = noconfcombos_convert($answer, $answer_order_lists, $output_order);
    noconfcombos_recurse($ret, $max_list_length, $stop_at, $answer_order_lengths, $answer_order_lists, $output_order, $answer, 0);

    return $ret;
}

# try swapping in different indexes into position $index, from the positions
# higher, then recurse
function noconfcombos_recurse(&$ret, $max_list_length, $stop_at, &$lengths, &$lists, &$output_order, $answer, $index) {
    if($index < $stop_at) {
        noconfcombos_recurse($ret, $max_list_length, $stop_at, $lengths, $lists, $output_order, $answer, $index + 1);
    }
    for($other = $index + 1; $other < $max_list_length; ++$other) {
        if($answer[$other] < $lengths[$index]) { # && $answer[$index] < $lengths[$other]) {
            $tmp = $answer[$index];
            $answer[$index] = $answer[$other];
            $answer[$other] = $tmp;
            $ret[] = noconfcombos_convert($answer, $lists, $output_order);
            if($index < $stop_at) {
                noconfcombos_recurse($ret, $max_list_length, $stop_at, $lengths, $lists, $output_order, $answer, $index + 1);
            }
        }
    }
}


function noconfcombos_convert(&$indexes, &$lists, &$order) {
    $ret = '';
    foreach($order as $i) {
        $ret .= $lists[$i][$indexes[$i]];
    }
    return $ret;
}

function noconfcombos_test() {
    $a = array('1', '2', '3', '4');
    $b = array('a', 'b', 'c', 'd', 'e');
    $c = array('x', 'y', 'z');
    $all = array($a, $b, $c);
    print_r(noconfcombos($all));
}

noconfcombos_test();
like image 167
JasonWoof Avatar answered Dec 13 '22 11:12

JasonWoof