Spearman rank correlation in Python with ties

Question

I want to compute the Spearman rank correlation in Python, most likely with the scipy implementation (scipy.stats.spearmanr).

My data is stored in dictionaries, for example:

{'a': 0.3, 'b': 0.2, 'c': 0.2} and {'a': 0.5, 'b': 0.6, 'c': 0.4}

To pass these to spearmanr, I assume I first have to convert the values to ranks (in descending order):

[1,2,3] and [2,1,3]

To account for the tie between b and c, should the first vector then be:

[1,2,2] or [1,2.5,2.5]

In short: is this approach correct, and how should ties be handled for dictionary-based data like this?

As @Jaime suggested, spearmanr works directly on the raw values. But then why do these two different inputs give the same result?

In [5]: spearmanr([0,1,2,3],[1,3,2,0])
Out[5]: (-0.39999999999999997, 0.59999999999999998)

In [6]: spearmanr([10,7,6,5],[0.9,0.5,0.6,1.0])
Out[6]: (-0.39999999999999997, 0.59999999999999998)

Thanks!

Jaime · Accepted Answer

scipy.stats.spearmanr computes the ranks for you. You only have to pass it the values, with both sequences in the same item order:

>>> scipy.stats.spearmanr([0.3, 0.2, 0.2], [0.5, 0.6, 0.4])
(0.0, 1.0)
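
For dictionary data, "the correct order" just means that position i in both lists refers to the same key. A minimal sketch, assuming string keys and using the question's values (the variable names scores_x, scores_y and keys are only for illustration):

```python
from scipy.stats import spearmanr

# Dictionaries modelled on the question's data; string keys are an assumption.
scores_x = {"a": 0.3, "b": 0.2, "c": 0.2}
scores_y = {"a": 0.5, "b": 0.6, "c": 0.4}

# Fix one key order and use it for both lists, keeping only keys present in both.
keys = sorted(scores_x.keys() & scores_y.keys())
x = [scores_x[k] for k in keys]
y = [scores_y[k] for k in keys]

# spearmanr ranks the raw values itself, ties included.
rho, p_value = spearmanr(x, y)
print(keys, x, y)
print(rho, p_value)
```

Any key order works as long as it is the same for both lists; sorting is just a convenient way to get one.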

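If you already have ranked data, you can call scipy.stats.pearsonr on the ranks to get the same result. As the examples below show, both tie conventions you tried give the same answer for this particular data, although [1, 2.5, 2.5] (tied items share the average of their ranks) is the usual convention for Spearman correlation. Whether the ranks start at 0 or at 1 makes no difference either: shifting every rank by a constant, e.g. to [0, 1.5, 1.5], leaves the correlation unchanged: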

>>> scipy.stats.pearsonr([1, 2, 2], [2, 1, 3])
(0.0, 1.0)
>>> scipy.stats.pearsonr([1, 2.5, 2.5], [2, 1, 3])
(0.0, 1.0)
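
That the two tie conventions agree above is specific to this small data set. As a rough check, the sketch below uses scipy.stats.rankdata on made-up data with one tie (x and y here are not from the question) to compare average ranks with "min" ranks, and then reduces the two calls from the question to their ranks:

```python
from scipy.stats import pearsonr, rankdata, spearmanr

# Made-up data with one tie in x (not from the question).
x = [1, 2, 2, 3]
y = [1, 2, 3, 4]

ry = rankdata(y)                    # no ties: ranks 1, 2, 3, 4
rx_average = rankdata(x)            # default method="average": 1, 2.5, 2.5, 4
rx_min = rankdata(x, method="min")  # tied values share the lowest rank: 1, 2, 2, 4

rho_spearman, _ = spearmanr(x, y)
rho_average, _ = pearsonr(rx_average, ry)
rho_min, _ = pearsonr(rx_min, ry)

print(rho_spearman, rho_average)  # these two agree
print(rho_min)                    # this one differs slightly

# The two calls from the question, reduced to ascending ranks.
ranks_5 = (rankdata([0, 1, 2, 3]), rankdata([1, 3, 2, 0]))
ranks_6 = (rankdata([10, 7, 6, 5]), rankdata([0.9, 0.5, 0.6, 1.0]))
print(ranks_5)  # ranks 1, 2, 3, 4 and 2, 4, 3, 1
print(ranks_6)  # ranks 4, 3, 2, 1 and 3, 1, 2, 4
```

So passing hand-made ranks to pearsonr only reproduces spearmanr when ties get averaged ranks. The last two lines also explain the identical outputs in the question: every rank r in the second call is 5 - r in the first, for both variables, and reversing both rankings leaves the correlation unchanged.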
