Let me elaborate on my question using a simple example. I have a=[a1,a2,a3,a4], where every ai is a numerical value.
What I want is the pairwise comparisons within 'a', such as I(a1>=a2), I(a1>=a3), I(a1>=a4), ..., I(a4>=a1), I(a4>=a2), I(a4>=a3), where I is an indicator function. So I used the following code.
res=[x>=y for x in a for y in a]
But it also gives comparison results like I(a1>=a1), ..., I(a4>=a4), which are always one. To get rid of these nuisance terms, I convert res into a numpy array and find the off-diagonal elements.
res1=numpy.array(res)
This gives the result I want, but I think there should be a more efficient or simpler way to do the pairwise comparison and extract the off-diagonal elements. Do you have any ideas about this? Thanks in advance.
You could use NumPy broadcasting:
import numpy as np
a = np.asarray(a)  # broadcasting needs an array; a plain list will not work
# Get the mask of comparisons in a vectorized manner using broadcasting
mask = a[:,None] >= a
# Select the elements other than diagonal ones
out = mask[~np.eye(a.size,dtype=bool)]
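To make the broadcasting step concrete, here is a minimal self-contained sketch. The sample values and the names `a_list`, `col` and `off_diag` are illustrative assumptions, not part of the original answer. `a[:, None]` has shape (n, 1); comparing it with `a` of shape (n,) broadcasts to an (n, n) matrix whose entry [i, j] is a[i] >= a[j].

```python
import numpy as np

a_list = [3, 7, 5, 8]          # illustrative input, a plain Python list
a = np.asarray(a_list)         # convert so that a[:, None] indexing works

col = a[:, None]               # shape (4, 1): each element in its own row
mask = col >= a                # shape (4, 4): mask[i, j] == (a[i] >= a[j])

off_diag = ~np.eye(a.size, dtype=bool)   # True everywhere except where i == j
out = mask[off_diag]           # 1-D array holding the n*(n-1) off-diagonal results

print(col.shape, mask.shape, out.shape)
```

The boolean index walks the matrix row by row, so `out` keeps the same ordering as the nested list comprehension in the question with the i == j terms dropped.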
If you would rather set the diagonal elements of mask to False, so that mask itself is the output, do it like so -
mask[np.eye(a.size,dtype=bool)] = 0
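As an alternative sketch (not what the answer above uses), NumPy's `np.fill_diagonal` can overwrite the diagonal in place without building the identity mask first. The sample values below are illustrative.

```python
import numpy as np

a = np.array([3, 7, 5, 8])     # illustrative values
mask = a[:, None] >= a         # full n x n comparison matrix

# Overwrite the main diagonal in place; fill_diagonal modifies mask and returns None
np.fill_diagonal(mask, False)

print(mask)
```

Either form leaves the self-comparisons as False; which one reads better is mostly a matter of taste.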
Sample run -
In [56]: a
Out[56]: array([3, 7, 5, 8])
In [57]: mask = a[:,None] >= a
In [58]: mask
Out[58]:
array([[ True, False, False, False],
       [ True,  True,  True, False],
       [ True, False,  True, False],
       [ True,  True,  True,  True]], dtype=bool)
In [59]: mask[~np.eye(a.size,dtype=bool)] # Selecting non-diag elems
Out[59]:
array([False, False, False,  True,  True, False,  True, False, False,
        True,  True,  True], dtype=bool)
In [60]: mask[np.eye(a.size,dtype=bool)] = 0 # Setting diag elems as False
In [61]: mask
Out[61]:
array([[False, False, False, False],
       [ True, False,  True, False],
       [ True, False, False, False],
       [ True,  True,  True, False]], dtype=bool)
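If it helps to see which comparisons belong to which element, the flat off-diagonal result can be reshaped to n rows of n-1 entries, since each row of the matrix contributes exactly n-1 off-diagonal values in order. This is a small sketch on the same sample array; the names `per_elem` and `wins` are made up for illustration.

```python
import numpy as np

a = np.array([3, 7, 5, 8])     # same values as the sample run
n = a.size
mask = a[:, None] >= a
out = mask[~np.eye(n, dtype=bool)]       # flat array of length n*(n-1)

# Row i now holds I(a[i] >= a[j]) for every j != i, in increasing j
per_elem = out.reshape(n, n - 1)

# Number of other elements that each a[i] is greater than or equal to
wins = per_elem.sum(axis=1)

print(per_elem)
print(wins)
```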
Runtime test
Reasons to use NumPy broadcasting? Performance! Let's see how with a large dataset -
In [34]: def pairwise_comp(A): # Using NumPy broadcasting
    ...:     a = np.asarray(A) # Convert to array if not already so
    ...:     mask = a[:,None] >= a
    ...:     out = mask[~np.eye(a.size,dtype=bool)]
    ...:     return out
    ...:
In [35]: a = np.random.randint(0,9,(1000)).tolist() # Input list
In [36]: %timeit [x >= y for i,x in enumerate(a) for j,y in enumerate(a) if i != j]
1 loop, best of 3: 185 ms per loop # @Sixhobbits's loopy soln
In [37]: %timeit pairwise_comp(a)
100 loops, best of 3: 5.76 ms per loop
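For anyone who wants to repeat the comparison outside IPython, here is a rough standalone sketch using the standard-library `timeit` module. The function name `pairwise_loop`, the input size and the repeat count are arbitrary choices for illustration, and the timings will differ from the figures above depending on machine and NumPy version. It also checks that both approaches return the same values.

```python
import timeit
import numpy as np

def pairwise_loop(values):
    # Pure-Python version: skip the i == j self-comparisons
    return [x >= y for i, x in enumerate(values)
            for j, y in enumerate(values) if i != j]

def pairwise_comp(values):
    # Broadcasting version, same logic as in the answer
    a = np.asarray(values)
    mask = a[:, None] >= a
    return mask[~np.eye(a.size, dtype=bool)]

rng = np.random.default_rng(0)                  # fixed seed for repeatability
values = rng.integers(0, 9, size=300).tolist()  # smaller than the 1000 used above

# Sanity check: both approaches give the same results in the same order
same = pairwise_loop(values) == pairwise_comp(values).tolist()
print("results agree:", same)

t_loop = timeit.timeit(lambda: pairwise_loop(values), number=3)
t_np = timeit.timeit(lambda: pairwise_comp(values), number=3)
print(f"loop: {t_loop:.4f}s  numpy: {t_np:.4f}s")
```

Keep in mind that the broadcasting version materializes the full n x n boolean matrix, so its memory use grows quadratically with the input length.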