Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python: pandas: match row value to column name/ key's value

Tags:

python

pandas

Problem
"How to match a row value to a column name and take that intersecting value in pandas"

Context
We have a pandas df like this:

df = pd.DataFrame([{'name': 'john', 'john': 1, 'mac': 10}, {'name': 'mac', 'john': 2, 'mac': 20}], columns=["name", "john", "mac"])

Looking like this:

name | john | mac
john |  1   | 10
mac  |  2   | 20


Desired output

name | john | mac  | value
john |  1   | 10   | 1
mac  |  2   | 20   | 20

In words, the column "value" should take the number from the corresponding column where name intersects.

So, if name == 'john', then take the value from column 'john'
So, if name == 'mac', then take the value from column 'mac'

Tried so far
Bunch of lambdas (none successful).

Specifications
Python: 3.5.2
Pandas: 0.18.1

like image 477
John Avatar asked Sep 16 '25 08:09

John


2 Answers

Deprecation Notice: lookup was deprecated in v1.2.0

You could use DataFrame.lookup, which accepts the row and column labels to use:

In [66]: df
Out[66]: 
   name  john  mac
0  john     1   10
1   mac     2   20

In [67]: df["value"] = df.lookup(df.index, df.name)

In [68]: df
Out[68]: 
   name  john  mac  value
0  john     1   10      1
1   mac     2   20     20

Note that this will have problems with duplicate row labels (which could be trivially worked around by adding a reset_index). It should be faster than calling apply, which can be pretty slow, but if your frames aren't too large both should work well enough.

like image 150
DSM Avatar answered Sep 19 '25 03:09

DSM


well imo lambda is the way to go, but you can make it very short such has:

df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df = df.set_index('name')
df
Out[64]: 
      john  mac
name           
john     5   10
mac     10   15

df['values'] = df.apply(lambda x: x[x.name], axis=1)
In[68]: df
Out[68]: 
      john  mac  values
name                   
john     5   10       5
mac     10   15      15

I did set the index to name for convinience but you could do it without it such has:

df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df['values'] = df.apply(lambda x: x[x['name']], axis=1)
df
Out[71]: 
   name  john  mac  values
0  john     5   10       5
1   mac    10   15      15
like image 41
Steven G Avatar answered Sep 19 '25 04:09

Steven G