Column Pair-wise reorganization in Pandas

Tags: python, pandas

I have data like this:

import pandas as pd
df = pd.DataFrame([['A:2','A:1','B:3','B:3'],['A:2','A:3','B:2','B:4'],['A:4','A:1','B:3','B:2']],columns=['A1','A2','B1','B2'])

df
   A1   A2   B1   B2
0  A:2  A:1  B:3  B:3
1  A:2  A:3  B:2  B:4
2  A:4  A:1  B:3  B:2

The values in A1 and A2 form one pair, and likewise B1 and B2.

Now I want to reorder the values within each pair so they are in alphabetical order:

df
   A1   A2   B1   B2
0  A:1  A:2  B:3  B:3
1  A:2  A:3  B:2  B:4
2  A:1  A:4  B:2  B:3

I can do this by looping over every pair in every row, sorting it, and writing it back into the DataFrame:

for index, row_ in df.iterrows():
    for pair_ in range(len(row_) // 2):
        pair = row_.iloc[pair_*2:(pair_*2+2)]
        # assign the raw sorted values so nothing is realigned to the column labels
        df.iloc[index, pair_*2:(pair_*2+2)] = pair.sort_values().to_numpy()

But this seems very inefficient.

Could you suggest a better approach? Thank you.

Thanh Nguyen asked Mar 03 '23 14:03


1 Answer

I would use np.sort:

import numpy as np
import pandas as pd

# number of columns in each group (2 for pairs)
num_col_in_group = 2
pd.DataFrame(np.sort(df.values.reshape(len(df), -1, num_col_in_group),
                     axis=-1).reshape(len(df), -1),
             columns=df.columns)

Output:

    A1   A2   B1   B2
0  A:1  A:2  B:3  B:3
1  A:2  A:3  B:2  B:4
2  A:1  A:4  B:2  B:3
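
In case the reshape chain is hard to follow, here is the same idea split into steps. This is only a sketch: the helper name sort_within_groups and its group_size parameter are made up for illustration, and it assumes the grouped columns sit next to each other and that the column count is a multiple of the group size.

```python
import numpy as np
import pandas as pd


def sort_within_groups(frame: pd.DataFrame, group_size: int = 2) -> pd.DataFrame:
    """Sort values inside each consecutive block of `group_size` columns, row by row.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    n_rows, n_cols = frame.shape
    if n_cols % group_size != 0:
        raise ValueError("column count must be a multiple of group_size")
    # (rows, cols) -> (rows, groups, group_size): one small vector per group
    blocks = frame.to_numpy().reshape(n_rows, n_cols // group_size, group_size)
    # sort along the last axis only, i.e. inside each group
    sorted_blocks = np.sort(blocks, axis=-1)
    # flatten back to the original 2-D layout and restore the labels
    return pd.DataFrame(sorted_blocks.reshape(n_rows, n_cols),
                        index=frame.index, columns=frame.columns)


df = pd.DataFrame(
    [["A:2", "A:1", "B:3", "B:3"],
     ["A:2", "A:3", "B:2", "B:4"],
     ["A:4", "A:1", "B:3", "B:2"]],
    columns=["A1", "A2", "B1", "B2"],
)
result = sort_within_groups(df)
print(result)
```

One caveat: with string values the sort is lexicographic, so "A:10" would come before "A:2". If the numeric part should decide the order, the values would need to be parsed to numbers first.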
Quang Hoang answered Mar 05 '23 15:03
