I have the following data in form of pandas data frame (df):
Each column name has starting and ending nodes separated by '_'. I.e, 'A_B' means a starting node of A and ending node of B
A_A A_B A_C B_B B_A B_C C_C C_A C_B
12 23 33 43 19 98 44 77 35
and I want to convert it into a a 2D matrix that gets output to a csv so that it looks like:
A B C
A 12 23 33
B 19 43 98
C 77 35 44
To do this, assume that we have
from itertools import groupby, combinations_with_replacement
col_names = ['A','B','C']
for i in combinations_with_replacement(col_names, 2):
get_val = df[i[0]+'_'+i[1]]
Is there a way in pandas to output this as a matrix and also print out row and column names?
I'm not sure if this was all done as elegantly as possible, but:
df = pd.read_clipboard()
df
Out[3]:
A_A A_B A_C B_B B_A B_C C_C C_A C_B
0 12 23 33 43 19 98 44 77 35
dfT = df.transpose().reset_index()
dfT
Out[8]:
index 0
0 A_A 12
1 A_B 23
2 A_C 33
3 B_B 43
4 B_A 19
5 B_C 98
6 C_C 44
7 C_A 77
8 C_B 35
dfT['col_name'] = dfT['index'].str.split('_').map(lambda x: x[1])
dfT['row_name'] = dfT['index'].str.split('_').map(lambda x: x[0])
dfT.pivot(index='row_name', columns='col_name', values=0)
Out[17]:
col_name A B C
row_name
A 12 23 33
B 19 43 98
C 77 35 44
And to drop the index names:
final_mat = dfT.pivot(index='row_name', columns='col_name', values=0)
final_mat
Out[19]:
col_name A B C
row_name
A 12 23 33
B 19 43 98
C 77 35 44
final_mat.index.name = None
final_mat.columns.name = None
final_mat
Out[22]:
A B C
A 12 23 33
B 19 43 98
C 77 35 44
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With