Pandas lookup from one of multiple columns, based on value

Question

I have the following DataFrame:

Date    best    a    b    c    d
1990    a       5    4    7    2
1991    c       10   1    2    0
1992    d       2    1    4    12
1993    a       5    8    11   6

I would like to make a dataframe as follows:

Date    best    value   
1990    a       5
1991    c       2
1992    d       12
1993    a       5

So I am looking to find a value based on another row value by using column names. For instance, the value for 1990 in the second df should lookup "a" from the first df and the second row should lookup "c" (=2) from the first df.

Any ideas?

chrisb · Accepted Answer

There is a built in lookup function that can handle this type of situation (looks up by row/column). I don't know how optimized it is, but may be faster than the apply solution.

In [9]: df['value'] = df.lookup(df.index, df['best'])

In [10]: df
Out[10]: 
   Date best   a  b   c   d  value
0  1990    a   5  4   7   2      5
1  1991    c  10  1   2   0      2
2  1992    d   2  1   4  12     12
3  1993    a   5  8  11   6      5

EdChum · Answer

You create a lookup function and call apply on your dataframe row-wise, this isn't very efficient for large dfs though

In [245]:

def lookup(x):
    return x[x.best]
df['value'] = df.apply(lambda row: lookup(row), axis=1)
df
Out[245]:
   Date best   a  b   c   d  value
0  1990    a   5  4   7   2      5
1  1991    c  10  1   2   0      2
2  1992    d   2  1   4  12     12
3  1993    a   5  8  11   6      5

Dev Khadka · Answer

You can do this using np.where like below. I think it will be more efficient

import numpy as np
import pandas as pd

df = pd.DataFrame([['1990', 'a', 5, 4, 7, 2], ['1991', 'c', 10, 1, 2, 0], ['1992', 'd', 2, 1, 4, 12], ['1993', 'a', 5, 8, 11, 6]], columns=('Date', 'best', 'a', 'b', 'c', 'd'))
arr = df.best.values

cols = df.columns[2:]
for col in cols:
    arr2 = df[col].values
    arr = np.where(arr==col, arr2, arr)

df.drop(columns=cols, inplace=True)
df["values"] = arr
df

Result

Date    best    values
0   1990    a   5
1   1991    c   2
2   1992    d   12
3   1993    a   5

Pandas lookup from one of multiple columns, based on value

Tags:

python

pandas

dataframe

numpy

AtotheSiv

3 Answers

chrisb

EdChum

Dev Khadka

Recent Activity

Donate For Us

Pandas lookup from one of multiple columns, based on value

Tags:

python

pandas

dataframe

numpy

AtotheSiv

3 Answers

chrisb

EdChum

Dev Khadka

Related questions

Recent Activity

Donate For Us