Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Flattening nested loops / decreasing complexity - complementary pairs counting algorithm

I was recently trying to solve some task in Python and I have found the solution that seems to have the complexity of O(n log n), but I believe it is very inefficient for some inputs (such as first parameter being 0 and pairs being very long list of zeros).

It has also three levels of for loops. I believe it can be optimized, but at the moment I cannot optimize it more, I am probably just missing something obvious ;)

So, basically, the problem is as follows:

Given list of integers (values), the function needs to return the number of indexes' pairs that meet the following criteria:

  • lets assume single index pair is a tuple like (index1, index2),
  • then values[index1] == complementary_diff - values[index2] is true,

Example: If given a list like [1, 3, -4, 0, -3, 5] as values and 1 as complementary_diff, the function should return 4 (which is the length of the following list of indexes' pairs: [(0, 3), (2, 5), (3, 0), (5, 2)]).

This is what I have so far, it should work perfectly most of the time, but - as I said - in some cases it could run very slowly, despite the approximation of its complexity O(n log n) (it looks like pessimistic complexity is O(n^2)).

def complementary_pairs_number (complementary_diff, values):
    value_key = {} # dictionary storing indexes indexed by values
    for index, item in enumerate(values):
        try:
            value_key[item].append(index)
        except (KeyError,): # the item has not been found in value_key's keys
            value_key[item] = [index]
    key_pairs = set() # key pairs are unique by nature
    for pos_value in value_key: # iterate through keys of value_key dictionary
        sym_value = complementary_diff - pos_value
        if sym_value in value_key: # checks if the symmetric value has been found
            for i1 in value_key[pos_value]: # iterate through pos_values' indexes
                for i2 in value_key[sym_value]: # as above, through sym_values
                    # add indexes' pairs or ignore if already added to the set
                    key_pairs.add((i1, i2))
                    key_pairs.add((i2, i1))
    return len(key_pairs)

For the given example it behaves like that:

>>> complementary_pairs_number(1, [1, 3, -4, 0, -3, 5])
4

If you see how the code could be "flattened" or "simplified", please let me know.

I am not sure if just checking for complementary_diff == 0 etc. is the best approach - if you think it is, please let me know.

EDIT: I have corrected the example (thanks, unutbu!).

like image 770
Tadeck Avatar asked Jan 13 '12 15:01

Tadeck


2 Answers

I think this improves the complexity to O(n):

  • value_key.setdefault(item,[]).append(index) is faster than using the try..except blocks. It is also faster than using a collections.defaultdict(list). (I tested this with ipython %timeit.)
  • The original code visits every solution twice. For each pos_value in value_key, there is a unique sym_value associated with pos_value. There are solutions when sym_value is also in value_key. But when we iterate over the keys in value_key, pos_value is eventually assigned to the value of sym_value, which make the code repeat the calculation it has already done. So you can cut the work in half if you can stop pos_value from equaling the old sym_value. I implemented that with a seen = set() to keep track of seen sym_values.
  • The code only cares about len(key_pairs), not the key_pairs themselves. So instead of keeping track of the pairs (with a set), we can simply keep track of the count (with num_pairs). So we can replace the two inner for-loops with

    num_pairs += 2*len(value_key[pos_value])*len(value_key[sym_value])
    

    or half that in the "unique diagonal" case, pos_value == sym_value.


def complementary_pairs_number(complementary_diff, values):
    value_key = {} # dictionary storing indexes indexed by values
    for index, item in enumerate(values):
        value_key.setdefault(item,[]).append(index)
    # print(value_key)
    num_pairs = 0
    seen = set()
    for pos_value in value_key: 
        if pos_value in seen: continue
        sym_value = complementary_diff - pos_value
        seen.add(sym_value)
        if sym_value in value_key: 
            # print(pos_value, sym_value, value_key[pos_value],value_key[sym_value])
            n = len(value_key[pos_value])*len(value_key[sym_value])
            if pos_value == sym_value:
                num_pairs += n
            else:
                num_pairs += 2*n
    return num_pairs
like image 200
unutbu Avatar answered Oct 03 '22 09:10

unutbu


You may want to look into functional programming idioms, such as reduce, etc.

Often times, nested array logic can be simplified by using functions like reduce, map, reject, etc.

For an example (in javascript) check out underscore js. I'm not terribly smart at Python, so I don't know which libraries they have available.

like image 33
Zee Spencer Avatar answered Oct 03 '22 08:10

Zee Spencer