I have a correlation matrix, but specified as pairs, like:
cm = pd.DataFrame({'name1': ['A', 'A', 'B'],
                   'name2': ['B', 'C', 'C'],
                   'corr': [0.1, 0.2, 0.3]})
cm
  name1 name2  corr
0     A     B   0.1
1     A     C   0.2
2     B     C   0.3
What is the simplest way to make this into a numpy 2d array correlation matrix?
     A    B    C
A  1.0  0.1  0.2
B  0.1  1.0  0.3
C  0.2  0.3  1.0
Not sure about pure numpy, since you are dealing with a pandas DataFrame. Here's a pure pandas solution:
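# pivot() arguments are keyword-only in pandas 2.0+, so name them explicitly
# instead of unpacking the column names with cm.pivot(*cm)
s = cm.pivot(index='name1', columns='name2', values='corr')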
ret = s.add(s.T, fill_value=0).fillna(1)
Output:
     A    B    C
A  1.0  0.1  0.2
B  0.1  1.0  0.3
C  0.2  0.3  1.0
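Calling ret.to_numpy() on the result above already gives a 2D array. If you would rather skip the pivot entirely, one possible approach is to map each label to an integer position and fill a symmetric array directly. This is only a sketch, not part of the answer above: the helper name pairs_to_corr_matrix is made up for illustration, and it assumes each pair is listed once and that any pair not listed should be 0.

```python
import numpy as np
import pandas as pd

def pairs_to_corr_matrix(cm, a="name1", b="name2", value="corr"):
    """Hypothetical helper: build a symmetric matrix from pairwise rows."""
    # Sorted list of every label that appears in either column
    labels = sorted(set(cm[a]) | set(cm[b]))
    # Map each label to its row/column position
    pos = {label: k for k, label in enumerate(labels)}
    i = cm[a].map(pos).to_numpy()
    j = cm[b].map(pos).to_numpy()
    # Start from the identity so the diagonal is 1.
    # Assumption: pairs that are not listed stay at 0.
    mat = np.eye(len(labels))
    values = cm[value].to_numpy()
    mat[i, j] = values
    mat[j, i] = values  # mirror to keep the matrix symmetric
    return labels, mat

cm = pd.DataFrame({'name1': ['A', 'A', 'B'],
                   'name2': ['B', 'C', 'C'],
                   'corr': [0.1, 0.2, 0.3]})
labels, mat = pairs_to_corr_matrix(cm)
```

Here labels gives the row and column order of mat. Note the difference from the pandas version: fillna(1) there fills every missing entry with 1, including any off-diagonal pair absent from the input, whereas this sketch leaves such pairs at 0.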
Extra: for the reverse (ret is as above):
(ret.where(np.triu(np.ones(ret.shape, dtype=bool), 1))
    .stack()
    .reset_index(name='corr')
)
Output:
  level_0 level_1  corr
0       A       B   0.1
1       A       C   0.2
2       B       C   0.3
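If you want the original column names back instead of level_0 and level_1, a possible variation is to name the axes before stacking. This is a sketch under assumptions: ret is rebuilt by hand here so the snippet runs on its own, and the explicit dropna() is a precaution in case your pandas version keeps the masked NaN rows when stacking.

```python
import numpy as np
import pandas as pd

# Stand-in for the `ret` matrix built in the answer above
ret = pd.DataFrame([[1.0, 0.1, 0.2],
                    [0.1, 1.0, 0.3],
                    [0.2, 0.3, 1.0]],
                   index=list('ABC'), columns=list('ABC'))

# Boolean mask that is True strictly above the diagonal
upper = np.triu(np.ones(ret.shape, dtype=bool), k=1)

pairs = (ret.where(upper)  # NaN out the diagonal and lower triangle
            .rename_axis(index='name1', columns='name2')
            .stack()
            .dropna()      # does nothing if stack() already dropped the NaNs
            .reset_index(name='corr'))
```

The result has the columns name1, name2 and corr, one row per pair above the diagonal.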