I was recently trying to solve some task in Python and I have found the solution that seems to have the complexity of O(n log n), but I believe it is very inefficient for some inputs (such as first parameter being 0
and pairs
being very long list of zeros).
It has also three levels of for
loops. I believe it can be optimized, but at the moment I cannot optimize it more, I am probably just missing something obvious ;)
So, basically, the problem is as follows:
Given list of integers (
values
), the function needs to return the number of indexes' pairs that meet the following criteria:
- lets assume single index pair is a tuple like
(index1, index2)
,- then
values[index1] == complementary_diff - values[index2]
is true,Example: If given a list like
[1, 3, -4, 0, -3, 5]
asvalues
and1
ascomplementary_diff
, the function should return4
(which is the length of the following list of indexes' pairs:[(0, 3), (2, 5), (3, 0), (5, 2)]
).
This is what I have so far, it should work perfectly most of the time, but - as I said - in some cases it could run very slowly, despite the approximation of its complexity O(n log n) (it looks like pessimistic complexity is O(n^2)).
def complementary_pairs_number (complementary_diff, values):
value_key = {} # dictionary storing indexes indexed by values
for index, item in enumerate(values):
try:
value_key[item].append(index)
except (KeyError,): # the item has not been found in value_key's keys
value_key[item] = [index]
key_pairs = set() # key pairs are unique by nature
for pos_value in value_key: # iterate through keys of value_key dictionary
sym_value = complementary_diff - pos_value
if sym_value in value_key: # checks if the symmetric value has been found
for i1 in value_key[pos_value]: # iterate through pos_values' indexes
for i2 in value_key[sym_value]: # as above, through sym_values
# add indexes' pairs or ignore if already added to the set
key_pairs.add((i1, i2))
key_pairs.add((i2, i1))
return len(key_pairs)
For the given example it behaves like that:
>>> complementary_pairs_number(1, [1, 3, -4, 0, -3, 5])
4
If you see how the code could be "flattened" or "simplified", please let me know.
I am not sure if just checking for complementary_diff == 0
etc. is the best approach - if you think it is, please let me know.
EDIT: I have corrected the example (thanks, unutbu!).
I think this improves the complexity to O(n)
:
value_key.setdefault(item,[]).append(index)
is faster than using
the try..except
blocks. It is also faster than using a collections.defaultdict(list)
. (I tested this with ipython %timeit.)pos_value
in value_key
, there is a unique sym_value
associated with
pos_value
. There are solutions when sym_value
is also in
value_key
. But when we iterate over the keys in value_key
,
pos_value
is eventually assigned to the value of sym_value
, which
make the code repeat the calculation it has already done. So you can
cut the work in half if you can stop pos_value
from equaling the
old sym_value
. I implemented that with a seen = set()
to keep
track of seen sym_value
s.The code only cares about len(key_pairs)
, not the key_pairs
themselves. So instead of keeping track of the pairs (with a
set
), we can simply keep track of the count (with num_pairs
). So we can replace the two inner for-loops with
num_pairs += 2*len(value_key[pos_value])*len(value_key[sym_value])
or half that in the "unique diagonal" case, pos_value == sym_value
.
def complementary_pairs_number(complementary_diff, values):
value_key = {} # dictionary storing indexes indexed by values
for index, item in enumerate(values):
value_key.setdefault(item,[]).append(index)
# print(value_key)
num_pairs = 0
seen = set()
for pos_value in value_key:
if pos_value in seen: continue
sym_value = complementary_diff - pos_value
seen.add(sym_value)
if sym_value in value_key:
# print(pos_value, sym_value, value_key[pos_value],value_key[sym_value])
n = len(value_key[pos_value])*len(value_key[sym_value])
if pos_value == sym_value:
num_pairs += n
else:
num_pairs += 2*n
return num_pairs
You may want to look into functional programming idioms, such as reduce, etc.
Often times, nested array logic can be simplified by using functions like reduce, map, reject, etc.
For an example (in javascript) check out underscore js. I'm not terribly smart at Python, so I don't know which libraries they have available.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With