Do a pairwise comparison of each element in two sets and return a top 3 ranklist

Given two sets, how do I perform a pairwise comparison of each element in one set with each element of the other set?

I would like to get the top 3 results for each element in the initial set.

Is there a faster way to solve the task? I am looking for a more Pythonic way of doing it.

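set1 = {str(item) for item in range(100)}      # note: the original sets contain strings
set2 = {str(item) for item in range(50, 150)}  # or: {str(item) for item in range(50, 100)}

results = {}
for item in set1:
    # three best scores seen so far, highest first
    # (named top3 so the built-in max() is not shadowed)
    top3 = [-1, -1, -1]
    for stuff in set2:
        val = magicComp(item, stuff)  # magicComp is my own scoring function
        if val > top3[0]:
            top3[2] = top3[1]
            top3[1] = top3[0]
            top3[0] = val
        elif val > top3[1]:
            top3[2] = top3[1]
            top3[1] = val
        elif val > top3[2]:
            top3[2] = val
    results[item] = top3  # keep the ranking instead of overwriting it on the next item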
Amrith Krishna asked Apr 14 '16 11:04



1 Answer

Your answer's not bad, it's better than sorting the array on each iteration, but it's still O(N^2).

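Since you know the array indices that you want, you can use the quickselect algorithm to find indices 0, 1 and 2 based on the magicComp function. Quickselect runs in O(n) expected time per selection (not O(log n)), so this avoids a full sort, but it doesn't lower the overall bound: every element of set1 still has to be compared with every element of set2.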

Based on the code in that link, your code would look something like:

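results = {}
ls2 = list(set2)
for el in set1:
    # select() is the quickselect routine from the linked code; it has to rank
    # ls2 by magicComp(el, candidate), best first, for indices 0-2 to be the top 3
    results[el] = [select(ls2, ii) for ii in [0,1,2]]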
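
If you don't want to carry a quickselect implementation around, the standard library's heapq.nlargest is a different technique that gets the same top 3 per element. The following is only a minimal sketch: magic_comp is a hypothetical stand-in for the asker's magicComp (here it assumes the strings hold integers and that a higher score is better), and top_matches is a name made up for illustration.

```python
import heapq

def magic_comp(a, b):
    # Hypothetical stand-in for magicComp: closer integers get a higher score.
    return -abs(int(a) - int(b))

def top_matches(set1, set2, comp, k=3):
    """Map each item of set1 to its k best (score, candidate) pairs from set2."""
    results = {}
    for item in set1:
        # Lazily score every candidate against the current item.
        scored = ((comp(item, other), other) for other in set2)
        # nlargest tracks only k entries at a time and returns them best first.
        results[item] = heapq.nlargest(k, scored, key=lambda pair: pair[0])
    return results

set1 = {str(n) for n in range(100)}
set2 = {str(n) for n in range(50, 150)}
ranking = top_matches(set1, set2, magic_comp)
```

This still calls the comparison once per pair, so the number of magic_comp calls is unchanged; what it saves is the hand-written bookkeeping, and it returns the matching element alongside each score.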
gct answered Sep 21 '22 00:09