Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ranking columns and selecting column names

Tags:

python

pandas

If I had the table:

a   b   c
15  15  5
20  10  7
25  30  9

And wanted to do two things 1) Pick out the column with the highest value across the axis and assign it to a column 2) Take the value and assign it to another column, such as:

a   b   c   1st 1st_value   2nd 2nd_value   3rd 3rd_value
15  15  5   a/b 15  c   5   NaN NaN
20  10  7   a   20  b   10  c   7
25  30  9   b   30  a   25  c   9

Is this possible?

like image 754
DGraham Avatar asked Oct 30 '22 10:10

DGraham


2 Answers

I can suggest you to solve it like this:

import pandas as pd
import numpy as np

df = pd.DataFrame([{'a': 15, 'b': 15, 'c': 5}, {'a': 20, 'b': 10, 'c': 7}, {'a': 25, 'b': 30, 'c': 9}])
ext = {0: 'st', 1: 'nd', 2: 'rd'}
cols = df.columns


def make_ranking(row, rank=0, is_value=False):
    values = list(row[cols])
    sorted_values = sorted(set(values), reverse=True)
    value = sorted_values[rank] if len(sorted_values) > rank else np.nan
    if not is_value:
        items = [k for k, v in enumerate(values) if v == value]
        value = '/'.join([cols[item] for item in items]) or np.nan
    return value

for i in range(len(cols)):
    df[str(i+1)+ext[i]] = df.apply(make_ranking, args=(i, False, ), axis=1)
    df[str(i+1)+ext[i]+'_value'] = df.apply(make_ranking, args=(i, True, ), axis=1)

print(df)

Output:

    a   b  c  1st  1st_value 2nd  2nd_value  3rd  3rd_value
0  15  15  5  a/b         15   c          5  NaN        NaN
1  20  10  7    a         20   b         10    c          7
2  25  30  9    b         30   a         25    c          9
like image 133
George Petrov Avatar answered Nov 08 '22 08:11

George Petrov


df_sorted = df.apply(lambda row: sorted(set(row), reverse=True) + [None]*(len(row)-len(set(row))), axis=1)

>>> df_sorted
    a   b   c
0  15   5 NaN
1  20  10   7
2  30  25   9

Rename columns if you want:

df_sorted.rename(columns={'a': '1st_value', 'b': '2nd_value', 'c': '3rd_value'}, inplace=True)

>>> df_sorted
   1st_value  2nd_value  3rd_value
0         15          5        NaN
1         20         10          7
2         30         25          9

Concat original and sorted if you want:

df_concat = pd.concat([df, df_sorted], axis=1)

>>> df_concat
    a   b  c  1st_value  2nd_value  3rd_value
0  15  15  5         15          5        NaN
1  20  10  7         20         10          7
2  25  30  9         30         25          9
like image 22
vk1011 Avatar answered Nov 08 '22 09:11

vk1011