I have a pairwise matrix:
>>> m
     a    b    c    d
a  1.0  NaN  NaN  NaN
b  0.5  1.0  NaN  NaN
c  0.6  0.0  1.0  NaN
d  0.5  0.4  0.3  1.0
I want to replace the NaNs in the top right with the corresponding values from the bottom left:
>>> m2
     a    b    c    d
a  1.0  0.5  0.6  0.5
b  0.5  1.0  0.0  0.4
c  0.6  0.0  1.0  0.3
d  0.5  0.4  0.3  1.0
I can do it by swapping columns and indexes:
cols = m.columns
idxs = m.index
for c in cols:
    for i in idxs:
        # copy the mirrored entry; .loc avoids chained assignment
        m.loc[c, i] = m.loc[i, c]
But that's slow with my actual data, and I'm sure there's a way to do it in one step. I know I can generate the upper-right version with "m.T", but I don't know how to replace NaN with non-NaN values to get the complete matrix. There's probably a single-step way to do this in numpy, but I don't know much matrix algebra.
The transpose of a matrix is denoted by the letter "T" in the superscript of the given matrix. For example, if A is the given matrix, then its transpose is written A' or A^T.
How about combine_first (docs):
>>> m.combine_first(m.T)
     a    b    c    d
a  1.0  0.5  0.6  0.5
b  0.5  1.0  0.0  0.4
c  0.6  0.0  1.0  0.3
d  0.5  0.4  0.3  1.0
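For anyone who wants to try this end to end, here is a minimal sketch. The construction of m is an assumption (the question only shows the printed frame, not the code that built it); combine_first itself is the call from the answer above.

```python
import numpy as np
import pandas as pd

labels = ["a", "b", "c", "d"]

# Assumed reconstruction of the lower-triangular frame from the question;
# NaN marks the missing upper-right half.
m = pd.DataFrame(
    [[1.0, np.nan, np.nan, np.nan],
     [0.5, 1.0, np.nan, np.nan],
     [0.6, 0.0, 1.0, np.nan],
     [0.5, 0.4, 0.3, 1.0]],
    index=labels,
    columns=labels,
)

# combine_first keeps the values m already has and fills its NaNs
# from m.T, matching cells by row and column label.
m2 = m.combine_first(m.T)
print(m2)
```

Because the fill is aligned on labels rather than on position, this should also work when the rows and columns are not in the same order, as long as both axes carry the same set of labels.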
Here is an alternative for when m is a NumPy array:
>>> m[np.triu_indices_from(m, k=1)] = m.T[np.triu_indices_from(m, k=1)]
>>> m
array([[ 1. ,  0.5,  0.6,  0.5],
       [ 0.5,  1. ,  0. ,  0.4],
       [ 0.6,  0. ,  1. ,  0.3],
       [ 0.5,  0.4,  0.3,  1. ]])
m.T[np.triu_indices_from(m, k=1)] returns the values above the diagonal of the transpose of m (that is, the values below the diagonal of m), and the assignment writes them into the positions above the diagonal of m.
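The same idea spelled out step by step, as a sketch on a plain NumPy array. The array a is an assumed copy of the question's data, and the name upper is only for illustration.

```python
import numpy as np

# Assumed copy of the question's matrix as a plain float array.
a = np.array([[1.0, np.nan, np.nan, np.nan],
              [0.5, 1.0, np.nan, np.nan],
              [0.6, 0.0, 1.0, np.nan],
              [0.5, 0.4, 0.3, 1.0]])

# Row and column indices of the strict upper triangle (k=1 skips the diagonal).
upper = np.triu_indices_from(a, k=1)

# a.T[upper] reads a[j, i] for every upper-triangle position (i, j),
# i.e. the mirrored lower-triangle value. Fancy indexing returns a copy,
# so the in-place assignment does not read values it has just written.
a[upper] = a.T[upper]
print(a)
```

This version ignores where the NaNs actually are and simply overwrites the whole upper triangle, so it is only appropriate when the lower triangle is known to be complete.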
With numpy.isnan():
>>> m[np.isnan(m)] = m.T[np.isnan(m)]
>>> m
     a    b    c    d
a  1.0  0.5  0.6  0.5
b  0.5  1.0  0.0  0.4
c  0.6  0.0  1.0  0.3
d  0.5  0.4  0.3  1.0
or better, with pandas.isnull():
>>> m[pd.isnull(m)] = m.T[pd.isnull(m)]
>>> m
     a    b    c    d
a  1.0  0.5  0.6  0.5
b  0.5  1.0  0.0  0.4
c  0.6  0.0  1.0  0.3
d  0.5  0.4  0.3  1.0
which in the end is equivalent to @DSM's solution!
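If you would rather not modify m in place, the mask idea can also be written as a non-mutating expression. This is a sketch, not something from the answers above: it swaps in DataFrame.where and np.where, and the construction of m is again an assumption.

```python
import numpy as np
import pandas as pd

labels = ["a", "b", "c", "d"]

# Assumed reconstruction of the frame from the question.
m = pd.DataFrame(
    [[1.0, np.nan, np.nan, np.nan],
     [0.5, 1.0, np.nan, np.nan],
     [0.6, 0.0, 1.0, np.nan],
     [0.5, 0.4, 0.3, 1.0]],
    index=labels,
    columns=labels,
)

# Keep m wherever it has a value; elsewhere take the mirrored cell from m.T.
filled = m.where(m.notna(), m.T)

# The same selection on the underlying array, purely by position.
a = m.to_numpy()
filled_arr = np.where(np.isnan(a), a.T, a)

print(filled)
```

Both forms leave m untouched and return a new object, which can be handy when the incomplete matrix is still needed afterwards.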