Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to determine all possible ways a group of values can be removed from a sequence

I'm trying to determine how many different ways I can remove a group of values from a sequence, leaving the original sequence in order (stable), and making sure to remove only 1 instance value each from the original sequence. For example, if I had [1,2,1,3,1,4,4] and I want to remove [1,4,4] my resulting combinations would be:

[1,2,1,3,1,4,4] \ [1,4,4] = [ [2,1,3,1], [1,2,3,1], [1,2,1,3] ]

or [1,2,1,3,1,4,4] \ [1,1] = [ [2,3,1,4,4], [1,2,3,4,4], [2,1,3,4,4] ]

I have javascript code I wrote to generate combinations of all array values without removal and the removal part seems like it should be easy but I'm not seeing the algorithm when needing to potentially remove multiple values multiple times.

like image 265
bkaid Avatar asked Aug 12 '16 16:08

bkaid


7 Answers

(Because it was unclear in the original version of the question whether you meant to remove a subsequence or an unordered list, I've updated my answer to address both cases.)

1. Removing a sub-sequence in order

We get a sequence of values, e.g. [1,5,1,3,4,2,3,2,1,3,1,2], and a sub-sequence of values to be removed from the first sequence, e.g. [1,3,2,1]. If we find where every instance of the values is situated in the sequence, we get a graph like this:

all connections

Finding all ways in which the values can be removed from the sequence, in order, then means finding all ways in which the to-be-removed values in the bottom row can be connected to an instance in the sequence, without any lines crossing, e.g.:

example solution

To avoid removing values in a way which doesn't lead to a valid solution (e.g. starting by removing the right-most value 1, after which there is no value 3 that can be removed) we will first remove all the connections in the graph that lead to such invalid solutions.

This will be done by iterating over the sub-sequence twice. First we do this left-to-right, and for each value, we look at its left-most connection, and remove any connections from the value to its right which meet or cross that connection; e.g. when considering the left-most connection from the value 2, two connections from the value 1 on its right (indicated in red) are removed because they cross this connection:

removing crossed connections ltr

In the next step we go from right to left, and for each value, we look at its right-most connection, and remove any connections from the value on its left which meet or cross that connection; e.g. when considering the right-most connection from the value 1 on the right, the right-most connection from the value 2 on its left (indicated in red) is removed because it crosses this connection:

removing crossed connections rtl

We then end up with the simplified graph shown below. The combinations are then made by combining every connected instance of each value in the sub-sequence, using e.g. recursion: you iterate over the options for the first value in the sub-sequence, and each time recurse with the rest of the sub-sequence, so that the option for the first value is combined with all the options for the other values.

simplified graph

There can be crossed connections in the simplified graph, but these are no longer problematic. In the example you'll see that even if the right connection is chosen for the value 3, there is a connection for the value 2 which doesn't cross with it:

example solution in simplified graph

Turning this into code is relatively straightforward; below the code snippet you will find a run-through of the code with the data from the example.

function removeSubSeq(seq, sub) {
    var posi = []; // list position of values in seq, then connect to positions in sub
    sub.forEach(function(elem) {posi[elem] = []});
    seq.forEach(function(elem, index) {if (posi[elem]) posi[elem].push(index)});
    var conn = sub.map(function(elem) {return posi[elem].slice()});

    for (var i = 1; i < conn.length; i++) {
        var left = conn[i - 1][0];
        while (conn[i][0] <= left) {
            conn[i].shift();                // remove crossed connection left-to-right
        }
    }
    for (var i = conn.length - 2; i >= 0; i--) {
        var right = conn[i + 1][conn[i + 1].length - 1];
        while (conn[i][conn[i].length - 1] >= right) {
            conn[i].pop();                  // remove crossed connection right-to-left
        }
    }
    var combinations = [], result = [];
    combine(0, -1, []);                     // initiate recursion to find combinations
    for (var i = 0; i < combinations.length; i++) {
        result[i] = seq.slice();            // create copies of seq and remove values
        for (var j = combinations[i].length - 1; j >= 0; j--) {
            result[i].splice(combinations[i][j], 1);
        }
    }
    return result;

    function combine(step, prev, comb) {    // generate combinations using recursion
        for (var i = 0; i < conn[step].length; i++) {
            var curr = conn[step][i];
            if (prev >= curr) continue;     // skip crossed connection
            if (step + 1 == conn.length) combinations.push(comb.concat([curr]));
            else combine(step + 1, curr, comb.concat([curr]));
        }
    }
}
var result = removeSubSeq([1,5,1,3,4,2,3,2,1,3,1,2], [1,3,2,1]);
for (var i in result) document.write(result[i] + "<br>");

The code performs these steps:

  • Create an associative array of the position of instances of each value present in sub:
posi[1] = [0,2,8,10], posi[2] = [5,7,11], posi[3] = [3,6,9]}  
  • Create a 2D array which links the positions in the sub-sequence to those in the sequence:
conn = [[0,2,8,10],[3,6,9],[5,7,11],[0,2,8,10]]  
  • Go over the array left-to-right and remove crossed connections:
conn = [[0,2,8,10],[3,6,9],[5,7,11],[8,10]]
  • Go over the array right-to-left and remove crossed connections:
conn = [[0,2],[3,6],[5,7],[8,10]]
  • Generate all combinations of the positions using recursion:
combinations = [[0,3,5,8],[0,3,5,10],[0,3,7,8],[0,3,7,10],
                [0,6,7,8],[0,6,7,10],[2,3,5,8],[2,3,5,10],
                [2,3,7,8],[2,3,7,10],[2,6,7,8],[2,6,7,10]]
  • Use the combinations to remove the values from copies of the sequence (see Code Snippet output).

2. Removing an unordered list of values

If the list of values to be removed is not necessarily a sub-sequence of the main sequence, and the values can be removed in any order, the same method as above can be used, with a relaxation of the crossed connection rules:

Removing crossed connections from the position list, and avoiding crossed connections when generating the combinations, only has to be done for identical values.

In this example, only the crossed connections for the duplicate values 1 are removed; first left-to-right:

removing crossed connections ltr

and then right-to-left:

removing crossed connections rtl

resulting in this simplified graph, with the problematic crossed connections for value 1 removed, and the crossed connections for values 2 and 3 remaining:

simplified graph

Below is a code example adapted from the version for sub-sequences; only a few lines in the code are changed, as indicated with comments (and I also used a different method to remove the values from the sequence). The list of values to-be-removed is sorted at the start, so that identical values are grouped together, to make it easy to keep track of identical values.

function removeSubList(seq, sub) {
    sub.sort(function(a, b) {return a - b});       /* SORT SUB-LIST FIRST */
    var posi = []; // list position of values in seq, then connect to positions in sub
    sub.forEach(function(elem) {posi[elem] = []});
    seq.forEach(function(elem, index) {if (posi[elem]) posi[elem].push(index)});
    var conn = sub.map(function(elem) {return posi[elem].slice()});

    for (var i = 1; i < conn.length; i++) {
        if (sub[i - 1] != sub[i]) continue;        /* SKIP FOR NON-IDENTICAL VALUES */
        var left = conn[i - 1][0];
        while (conn[i][0] <= left) {
            conn[i].shift();                // remove crossed connection left-to-right
        }
    }
    for (var i = conn.length - 2; i >= 0; i--) {
        if (sub[i] != sub[i + 1]) continue;        /* SKIP FOR NON-IDENTICAL VALUES */
        var right = conn[i + 1][conn[i + 1].length - 1];
        while (conn[i][conn[i].length - 1] >= right) {
            conn[i].pop();                  // remove crossed connection right-to-left
        }
    }
    var combinations = [], result = [];
    combine(0, -1, []);                     // initiate recursion to find combinations
    for (var i = 0; i < combinations.length; i++) {
        var s = seq.slice();                // create copy of seq and delete values
        combinations[i].forEach(function(elem) {delete s[elem]});
        result[i] = s.filter(function(elem) {return elem != undefined});
    }
    return result;

    function combine(step, prev, comb) {    // generate combinations using recursion
        for (var i = 0; i < conn[step].length; i++) {
            var curr = conn[step][i];
            if (prev >= curr && seq[prev] == seq[curr]) continue;   /* SKIP FOR NIV */
            if (step + 1 == conn.length) combinations.push(comb.concat([curr]));
            else combine(step + 1, curr, comb.concat([curr]));
        }
    }
}
var result = removeSubList([1,5,1,3,4,2,3,2,1,3,1,2], [1,3,1,2,1]);
for (var i in result) document.write(result[i] + "<br>");
like image 142

This can be done with simple combinatorics.

For simplicity, let's say values in original list are 1,2,3,...,n.
Let a[i] be the number of occurances of i in the original list.
Let b[i] be the number of occurances of i in the list of value removals

The number of options to reduce i is Choose(a[i],b[i]) = a[i]!/((a[i]-b[i])!b[i]!)

Since you combine all of these in "AND" closure, the total number of possibilities is:

Choose(a[1],b[1]) * Choose(a[2],b[2]) * ... * Choose(a[n], b[n])

Regarding values that are not in the reduction set, you don't need to worry about them. since their value in b list will be 0, and Choose(x,0) = 1 for all x


This leaves you with linear time solution (assuming you can calculate Choose(.,.) in constant time after doing some preprocessing to cache factorial values.


In your example you have:

a = [3, 1, 1, 2]
b = [1, 0, 0, 2]
Choose(3,1) = 3
Choose(1,0) = 1
Choose(1,0) = 1
Choose(2,2) = 1
#number_possibilities = 3 * 1 * 1 * 1 = 3
like image 40
amit Avatar answered Oct 18 '22 02:10

amit


If you'd like to enumerate unordered combinations of a multi-set of elements, you can do exactly that:

Record the positions in the array of the elements in the multi-set; enumerate all combinations of choose(indexes,multiplicity).

JavaScript code:

// straighforward choose(n,r) combinations
function choose(ns,r){
  if (r > ns.length) return [];
    
  var res = [];

  function _choose(i,_res){
    if (_res.length == r){
      res.push(_res);
      return;

    } else if (_res.length + ns.length - i == r){
      _res = _res.concat(ns.slice(i));
      res.push(_res);
      return
    }

    var temp = _res.slice();
    temp.push(ns[i]);

    _choose(i + 1,temp);
    _choose(i + 1,_res);
  }

  _choose(0,[]);
  return res;
}

// function to collect an array without specified indexes
function remove(arr,indexSet){
  var _arr = [];
  arr.forEach(function(v,i){ if (!indexSet.has(i)) _arr.push(arr[i]); });
  return _arr;
}

// main function
// the multiset is formatted as {element: [multiplicity,indexes]}
function removeAllCombs(arr,multiset){
  var res = [];
  
  // record the positions of multiset elements in the array
  arr.forEach(function(v,i){
    if (multiset[v]) multiset[v][1].push(i);
  });
  
  var keys = Object.keys(multiset);
  
  function enumerate(i,accum){
    if (i == keys.length){
      res.push(remove(arr,new Set(accum)));
      return;
    }
    
    var combs = choose(multiset[keys[i]][1],multiset[keys[i]][0]);

    for (let j in combs){
      var _accum = accum.slice();
      _accum = _accum.concat(combs[j]);
      
      enumerate(i + 1,_accum);
    }
  }
  
  enumerate(0,[]);
  return res;
}

console.log(JSON.stringify(
  removeAllCombs([1,2,1,3,1,4,4],{1: [1,[]], 4: [2,[]]})
));
like image 33
גלעד ברקן Avatar answered Oct 18 '22 00:10

גלעד ברקן


To determine all the ways a group of values (let's call this group needles) can be removed from a sequence (called haystack) do the following:

  1. Count how many times you have to remove each needle (let's call this count k). This can be determined by a single pass over the needles.
  2. Find all locations of each needle to be removed in the haystack. This can be determined by a single pass over the haystack.
  3. Generate all possible ways that you can remove each individual needle k times from the locations found. This is a standard enumeration of k-combinations and its time complexity is non-polynomial.
  4. Generate all possible ways you can combine each individual needle's removal possibilities together. This is a standard n-fold Cartesian product and its time complexity is also non-polynomial.
  5. For each way found, filter out the relevant elements from the haystack.

The following is an ECMAScript 2016 implementation of this approach:

function* removalCombinations(haystack, needles) {
  // Comments walk through sample input of haystack = [1,2,1,3,1,4,4] and needles = [1,4,4]

  // How many of each needle there are, e.g.,
  // needleCounts = { 1 => 1, 4 => 2 }
  let needleCounts = elementCounts(needles);

  // Where each needle is located, e.g.,
  // needleIndexes = { 1 => [ 0, 2, 4 ], 4 => [ 5, 6 ] }
  let needleIndexes = findIndices(needleCounts.keys(), haystack.entries());

  // The possible indices to be removed for a particular needle, e.g.,
  // indexCombinations = [ [ [ 0 ], [ 2 ], [ 4 ] ], [ [ 5, 6 ] ] ]
  var indexCombinations = [];
  for (let [needle, indexes] of needleIndexes) {
    indexCombinations.push(Array.from(generateCombinations(indexes, needleCounts.get(needle))));
  }

  // All the ways that the possible index removals can be fully combined together, e.g.,
  // fullRemovalCombinations = [ [ 0, 5, 6 ], [ 2, 5, 6 ], [ 4, 5, 6 ] ]
  let fullRemovalCombinations = cartesianProductOf(indexCombinations);

  // For every possible index removal combination,
  // filter those indices from the original haystack, e.g.,
  // yielded values = [ [ 2, 1, 3, 1 ], [ 1, 2, 3, 1 ], [ 1, 2, 1, 3 ] ]
  for (let indicesToFilter of fullRemovalCombinations) {
    indicesToFilter = new Set(indicesToFilter);
    yield haystack.filter((_, index) => !indicesToFilter.has(index))
  }

  // Calculates how many there are of each element.
  function elementCounts(iterable) {
    let counts = new Map();
    for (let el of iterable) {
      counts.set(el, counts.get(el) + 1 || 1);
    }
    return counts;
  }

  // Finds the indices of where each target occurs within iterable.
  function findIndices(targets, entries) {
    let indices = new Map();
    for (let el of targets) {
      indices.set(el, []);
    }
    for (let [index, value] of entries) {
      if (indices.has(value)) {
        indices.get(value).push(index);
      }
    }
    return indices;
  }

  // Generates all possible combinations of choosing k elements from arr.
  function* generateCombinations(arr, k) {
    function* doGenerateCombinations(offset, combo) {
      if (combo.length == k) {
        yield combo;
      } else {
        let len = arr.length;
        for (let i = offset; i < len; i++) {
          yield * doGenerateCombinations(i + 1, combo.concat(arr[i]));
        }
      }
    }

    yield* doGenerateCombinations(0, []);
  }

  // Given an array of arrays, generates all ways the elements can be combined together,
  // when taking a single element from each array.
  function* cartesianProductOf(arrays) {
    function* doCartesianProductOf(i, prod) {
      if (i == arrays.length) {
        yield prod;
      } else {
        for (let j = 0; j < arrays[i].length; j++) {
          yield* doCartesianProductOf(i + 1, prod.concat(arrays[i][j]));
        }
      }
    }

    yield* doCartesianProductOf(0, []);
  }
}

console.log(JSON.stringify(Array.from(removalCombinations([1, 2, 1, 3, 1, 4, 4], [1, 4, 4]))));
console.log(JSON.stringify(Array.from(removalCombinations([8, 6, 4, 4], [6, 4, 8]))));
like image 21
heenenee Avatar answered Oct 18 '22 02:10

heenenee


I think Branching and Pruning is the orthodox way of solving this problem and with much optimization possibility.
But, if you just want a simple and intuitive solution. Here it is.

First, find the numbers which is in the remove list.
[1,2,1,3,1,4,4][1,4,4]
from this we get [1,1,1,4,4]
Second, choose as many as the remove list elements from first step, which is combination 5C3.
from this we get [1,1,1] [1,1,4] [1,4,4] ....
Third, compare the sequence. then, you get the result.
Here is the code.. Sorry it is in C++, and I used a simple combination library.

#include<vector>
#include<algorithm>
#include<iostream>
#include"combi.h"
using namespace std;

int main()
{
    vector<int> list {1,2,1,3,1,4,4};
    vector<int> to_remove {1,4,4};
    vector<int> index;
    for(int i=0; i<list.size(); i++) {
        if(find(to_remove.begin(), to_remove.end(), list[i]) != to_remove.end())
            index.push_back(i);//insert index
    }
    bool sequence;
    nCr ncr(index.size(), to_remove.size());
    while(ncr.next()) {
        sequence = true;
        for(int i=0; i<ncr.size(); i++) 
            if(list[index[ncr[i]-1]] != to_remove[i]) sequence = false;
        if(sequence) {
            for(int i=0, j=0; i<list.size(); i++) {
                if(i == index[ncr[j]-1]) j++;
                else cout << list[i] << ' ';
            }
            cout << endl;
        }
    }
}

Here is the combination library..

class Combination
{
public:
    Combination(int n, int r);
    virtual ~Combination() { delete [] ar;}
    int& operator[](unsigned i) {return ar[i];}
    bool next();
    int size() {return r;}

protected:
    int* ar;
    int n, r;
};

class nCr : public Combination
{
public: 
    nCr(int n, int r);
    bool next();
};

Combination::Combination(int n, int r)
{
    ar = new int[r];
    this->n = n;
    this->r = r;
}

nCr::nCr(int n, int r) : Combination(n, r)
{
    if(r == 0) return;
    for(int i=0; i<r-1; i++) ar[i] = i + 1;
    ar[r-1] = r-1;
}

bool nCr::next()
{
    if(r == 0) return false;
    ar[r-1]++;
    int i = r-1;
    while(ar[i] == n-r+2+i) {
        if(--i == -1) return false;
        ar[i]++;
    }
    while(i < r-1) ar[i+1] = ar[i++] + 1;
    return true;
}
like image 27
Zeta Avatar answered Oct 18 '22 02:10

Zeta


Nice exercise, as usual it takes 1 timeunits to code and 10 to type :-). I cannot meet the language constraint, as i use a yet to be named language, hence i may be out of the competition. But i shall challenge everyone who provided a solution with a correctness check. Sorry for omitting the commas. Please check with these arguments:

[1 2 1 3 1 4 4] \ [1 4 4 1]

should produce the following solutions:

(2 3 1)(2 1 3)(1 2 3) 

And

[1 2 1 3 1 4 4] \ [1 4 4 1 1]

should produce the following solution:

(2 3)

And

[1 1 1 1 1] \ [1 1 1]

should (imho) produce the following solution:

(1 1)

And

[1] \ [2]

should (imho) produce the following solution:

[zero-length array]

And

[1 2 1 1 4 4 3 8 6 4 1 1 4 3 2 1] \ [1 1 4 1 1 1 3 4 8 6 2 2 4]

should produce the following solutions:

(4 3 1)(3 4 1)(1 4 3)(3 1 4)(4 1 3)(1 3 4) 

SOLUTION:

This won't be the simplest thing to implement, though it's pretty clear logically. I'm using the term "sub-array", like this:

(1 2 3)(4 5 6)(7 8 9 10) <- Array with 3 "sub-arrays", 3 "elements"

Step 1: Assign the arguments (following the original example)

arg = 1,2,1,3,1,4,4   
vec = 1,4,4   

Step 2: Check the uniques in vec, and how many of them there are.

A = 1,4 // The uniques in vec
B = 1,2 // Occurances of them

Step 3: Build indexes into arg for each of A (1-origin):

C = (1 3 5)(6 7) // 1 appears at indexes 1,3,5 in arg, 4 appears at indexes 6,7

Step 4: Take each element of C each element of B times:

D = (1 3 5)(6 7)(6 7) // B is (1,2) so we take (1 3 5) once and (6 7) twice.

Step 5: (The tricky step) Create all combinations of elements in D, using an outer join:

This happens by first creating all combinations of the two rightmost elements, ie. (6 7) and (6 7):

(6 6)(6 7)(7 6)(7 7) // (6 7) combined with (6 7) all possible ways

Then combine this with next D (towards left, that is):

E = (1 6 6)(1 6 7)(1 7 6)(1 7 7)(3 6 6)(3 6 7)(3 7 6)(3 7 7)(5 6 6)(5 6 7)(5 7 6)(5 7 7) // (1 3 5) combined with (6 6)(6 7)(7 6)(7 7) all possible ways

If there were more elements in D, we'd take them one by one (towards left) and combine with the achieved combinations so far. Until all elements of D are done (where "element" is a "sub-array").

Step 6: Remove such elements from E that contain duplicate numbers "inside" (eg. element (1 6 6) shall be removed):

F = (1 6 7)(1 7 6)(3 6 7)(3 7 6)(5 6 7)(5 7 6) // What is left from E

Step 7: Remove from F, when sub-arrays are sorted internally, such elements that are duplicates:

(1 6 7)(1 6 7)(3 6 7)(3 6 7)(5 6 7)(5 6 7) // F with sub-arrays sorted internally
G = (1 6 7)(3 6 7)(5 6 7)                  // Duplicate sub-arrays removed

Step 8: Almost ready! What we have now are the "non-indexes" into arg - those indexes that shall be excluded.

arg has 7 elements, hence all indexes into it are (1,2,3,4,5,6,7).

Picking the first element of G above, (1 6 7), means that indexes (1 2 3 4 5 6 7) without (1 6 7) is the first answer. All answers/indexes:

(1 2 3 4 5 6 7) without (1 6 7) -> (2 3 4 5). arg[2 3 4 5] is (2 1 3 1)
(1 2 3 4 5 6 7) without (3 6 7) -> (1 2 4 5). arg[1 2 4 5] is (1 2 3 1)
(1 2 3 4 5 6 7) without (5 6 7) -> (1 2 3 4). arg[1 2 3 4] is (1 2 1 3)

Hence the answers are

(2 1 3 1)(1 2 3 1)(1 2 1 3) 

Step 9: (Optional) The answer may contain duplicates at element level. Preserve only the uniques.

You can try this Dyalog APL one-liner at tryapl.org:

1 2 1 3 1 4 4 {↑∪{c[b~⍵]}¨{(∊⍵≡¨∪¨⍵)/⍵},⊃∘.,/(+/¨a=¨⊂⍵)/((a←∪⍵)=¨⊂⍺)/¨⊂b←⍳⍴c←⍺} 1 4 4

Paste and press [enter], you get:

2 1 3 1
1 2 3 1
1 2 1 3

You won't be able to test the very longest challenged sample above, as it exceeds the tryapl server's available processing time allocation, but feel free to test with any shorter arguments.

like image 21
Stormwind Avatar answered Oct 18 '22 02:10

Stormwind


Here is a solution that uses a reiterating function to reduce the values in steps. This function will not return solutions if not all values of the values that need removed are present in the starting array.

// Algorithm to strip values from an array
// Note, if not all elements of the stripValues array are found this function will return no solutions

function stripValues(startingValues, stripValues) {
    let solutions = []

    searchForSolutions(startingValues, stripValues, solutions, [])

    return solutions
}

function searchForSolutions(startingValues, stripValues, solvedSolutions, possibleSolution) {
    // If there are values to remove
    if(stripValues.length > 0) {
        // Copy the values of any possible solution to avoid tampering
        let newPossibleSolution = []
        possibleSolution.forEach((value) => {
            newPossibleSolution.push(value)
        })

        // Loop through the starting values looking for an instance of the first remaining value to strip
        for(i = 0; i < startingValues.length; i++) {
            if(startingValues[i] == stripValues[0]) {
                // The value was found, so reduce the arrays and look for the next element to remove
                let remainingStripValues = []
                stripValues.forEach((value, index) => {
                    if(index > 0) {
                        remainingStripValues.push(value)
                    }
                })

                let remainingValues = []
                for(j = i + 1; j< startingValues.length; j++) {
                    remainingValues.push(startingValues[j])
                }

                // Reiterate the search
                searchForSolutions(remainingValues, remainingStripValues, solvedSolutions, newPossibleSolution)
            }

            // Whether or not the value was found we want to continue finding permutations 
            newPossibleSolution.push(startingValues[i])
        }
    } else {
        //There are no remaining values to search for, so we have found a solution
        for(i = 0; i < startingValues.length; i++) {
            newPossibleSolution.push(startingValues[i]);
        }

        solvedSolutions.push(newPossibleSolution)
    }
}
like image 29
Isahiro Avatar answered Oct 18 '22 01:10

Isahiro