Get Column Names Sorted by their Values in a DataFrame

Question

I have a huge dataframe from which I would like to create a dictionary. The keys of the dictionary will be the row indices, and the values will be lists of the dataframe's column names sorted by the values in that row (descending order). Consider the example below:

df=      23    45    12     3     6
    45   0.2   1     0.12   0.5   0.1
    12   0.5   0.2   1      0.3   0.9
    23   0.1   0.9   0.3    1     0.5

I would like to create a dictionary in the following form:

dict={ '45':['45','3','23','12','6'], 
       '12':['12','6','23','3','45'], 
       '23':['3','45','6','12','23']}

where the values are column names sorted by their values in that row. I tried the following:

for idx, row in df.iterrows():
    l = row.values.tolist()
    l.sort(reverse=True)
    print(idx, l)

but this gives me the values and not the column names sorted in descending order. Any help on how I can produce the desired result will be appreciated. Thanks.

Ami Tavory · Accepted Answer

Well, this seems to work:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 10, 50], 'B': [2, -8, 3, 7], 'C': [1, 10, -20, 1]})

>>> dict([(r[0], list(df.columns[np.argsort(list(r)[1:])]))
...       for r in list(df.to_records())])
{0: ['A', 'C', 'B'],
 1: ['B', 'A', 'C'],
 2: ['C', 'B', 'A'],
 3: ['C', 'B', 'A']}

Explanation:

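list(df.to_records()) is a list of the rows as tuples, each starting with the row's index label.
r[0] is the first element of the tuple, i.e. the index label, which becomes the dictionary key.
list(r)[1:] is the rest of the tuple, i.e. the row's values.
np.argsort returns the indices that would sort an array in ascending order; reverse them (e.g. with [::-1]) to get the descending order the question asks for.
dict(list_of_pairs) creates a dictionary from a list of (key, value) pairs.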
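
For a large frame, the same idea can be sketched in a vectorized way that also produces the descending order from the question. This is only a sketch under some assumptions: the helper name columns_by_row_value is made up for illustration, the row and column labels of the question's example are taken to be integers, and all values are assumed numeric (the negation trick relies on that).

```python
import numpy as np
import pandas as pd

# The example frame from the question (integer labels are an assumption).
df = pd.DataFrame(
    [[0.2, 1.0, 0.12, 0.5, 0.1],
     [0.5, 0.2, 1.0, 0.3, 0.9],
     [0.1, 0.9, 0.3, 1.0, 0.5]],
    index=[45, 12, 23],
    columns=[23, 45, 12, 3, 6],
)


def columns_by_row_value(frame, descending=True):
    """Map each row label to its column names ordered by that row's values.

    Hypothetical helper, not part of pandas; assumes numeric values.
    """
    values = frame.to_numpy()
    # Negating the values makes an ascending argsort yield descending order;
    # kind="stable" keeps tied columns in their original left-to-right order.
    order = np.argsort(-values if descending else values, axis=1, kind="stable")
    # Index the column labels once for the whole frame: shape (n_rows, n_cols).
    sorted_cols = frame.columns.to_numpy()[order]
    return {label: row.tolist() for label, row in zip(frame.index, sorted_cols)}


result = columns_by_row_value(df)
```

Here result[23] should come out as [3, 45, 6, 12, 23]. Sorting the whole array with one argsort call along axis=1 avoids a Python-level sort per row, which tends to matter on a huge frame; passing descending=False gives the ascending order of the answer above.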

Tags: python, dictionary, sorting, dataframe

BajajG
