Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Improving algorithm that uses the cartesian product

Problem

Say you have n lists of integers, where each list only includes integers in the range from 1 to n. For example, for n = 4, we might have:

a_1 = [1, 2]
a_2 = [3]
a_3 = [4, 1, 1]
a_4 = [2, 3]

Now my question is: Can I tick off all the integers between 1 and n in those n lists but with the catch that once I find one number, I can't use that list anymore to find subsequent numbers?

For example, in the above example with n = 4, I can choose 1 from a_1, 2 from a_4, 3 from a_2, and 4 for a_3, and I have therefore filled all the numbers from 1 to 4 but only using each list once.

An example where I can't find the range (and hence should return False) would be:

a_1 = [1, 2]
a_2 = [3, 3, 5]
a_3 = [4]
a_4 = [5]
a_5 = [3, 4, 5]

The reason is because if I choose 1 from a_1, I can't choose 2 anymore from any list.

Approach

This is my current straightforward approach. I make a cartesian product of the lists and check if there is any that, sorted, will be a range.

import itertools

def fillRange(lists):
  cartesian = itertools.product(*lists)
  return any([sorted(comb) == range(1, len(lists) + 1) for comb in cartesian])

Question

Although my approach works, for large lists it becomes a bit inefficient. Any thoughts on how I can improve this algorithm?

Thanks!

like image 563
McGuire Avatar asked Aug 25 '17 13:08

McGuire


People also ask

What is the use of Cartesian product in Computer Science?

Ans: The basic use of a cartesian product is to find out the set of all possible ordered pairs from given sets. One real-life application is that most computers show images as a raster of dots known as pixels, each of which can be addressed using its coordinates.

Who coined the term Cartesian?

René Descartes, a French mathematician and philosopher has coined the term Cartesian. The cartesian product of 2 non-empty sets A and B is the set of all possible ordered pairs where the first component is from A and the second component is from B. The resultant collection of ordered pairs is denoted by A × B.

What is the Cartesian product of 2 non-empty sets?

The cartesian product of 2 non-empty sets A and B is the set of all possible ordered pairs where the first component is from A and the second component is from B. The resultant collection of ordered pairs is denoted by A × B.

What are some real-life applications of coordinates in computer graphics?

One real-life application is that most computers show images as a raster of dots known as pixels, each of which can be addressed using its coordinates. These coordinates are ordered pairs and so Cartesian product elements.


2 Answers

You can formulate this as a maximum flow problem in a bipartite graph where the left nodes correspond to lists, and the right nodes correspond to integers 1 to n.

There is an edge in the graph iff the integer is in the corresponding list.

All capacities in the graph are equal to 1.

If you can find a flow of size n from the left side to the right side then the problem is soluble.

Python code to do this below:

import networkx as nx

a_1 = [1, 2]
a_2 = [2]
a_3 = [4, 1, 1]
a_4 = [2, 3]
A = [a_1,a_2,a_3,a_4]
n = 4

G=nx.DiGraph()
for i,a in enumerate(A):
    for j in set(a):
        l = 'list'+str(i)
        G.add_edge(l,j,capacity=1)
        G.add_edge('start',l,capacity=1)
for j in range(1,n+1):
    G.add_edge(j,'dest',capacity=1)
v,flow = nx.maximum_flow(G,'start','dest')
if v<n:
    print 'Impossible'
else:
    for i,a in enumerate(A):
        for j in set(a):
            if flow['list'+str(i)][j]>0:
                print 'Use',j,'from list',a

This prints:

Use 1 from list [1, 2]
Use 2 from list [2]
Use 4 from list [4, 1, 1]
Use 3 from list [2, 3]
like image 154
Peter de Rivaz Avatar answered Oct 15 '22 10:10

Peter de Rivaz


Instead of testing all the combinations in order, you can speed this up a lot by testing the most-constrained lists first, and also updating the alternatives in the other lists as you add elements to your solution set. This way, you can "solve" both your examples without backtracking once.

def search(n, lists):
    if n == 0:
        yield []
    else:
        lists = [l for l in lists if l != []]
        if len(lists) >= n:
            least = min(lists, key=len)
            for val in least:
                new = [[x for x in lst if x != val] for lst in lists if lst is not least]
                for res in search(n-1, new):
                    yield [val] + res

Here's some debugging/tracing output for your two examples to help with the understanding. First value is n, then the lists, and finally the previously chosen val.

4 [[1, 2], [3], [4, 1, 1], [2, 3]] None
3 [[1, 2], [4, 1, 1], [2]] 3
2 [[1], [4, 1, 1]] 2
1 [[4]] 1
0 [] 4 --> solution
[3, 2, 1, 4]

5 [[1, 2], [3, 3, 5], [4], [5], [3, 4, 5]] None
4 [[1, 2], [3, 3, 5], [5], [3, 5]] 4
3 [[1, 2], [3, 3], [3]] 5
2 [[1, 2], []] 3 --> abort

If you also want the indices of the lists the elements have been taken from, the code gets a little more complicated, but not much:

def search(n, lists):
    if n == 0:
        yield []
    else:
        if sum(1 for l in lists if l) >= n:
            i = min(range(len(lists)), key=lambda x: (lists[x] == [], len(lists[x])))
            for val in lists[i]:
                new = [[x for x in lst if x != val] if lst is not lists[i] else [] for lst in lists]
                for res in search(n-1, new):
                    yield [(i, val)] + res

Result for your first example then is [(1, 3), (3, 2), (0, 1), (2, 4)]

like image 28
tobias_k Avatar answered Oct 15 '22 09:10

tobias_k