
Extracting values from dataframe1 using conditions set in dataframe2 (pandas, Python)

I have two DataFrames (df1 and df2). I'm trying to figure out how to use conditions from df2 to extract values from df1 and then store the extracted values in df2.

df1 = values to extract from

df2 = conditions for extraction; also the DataFrame where the extracted values are stored

Conditions: df2.HJ matches df1.HJ (the row), and df2.JK names the df1 P column to read (P1 to P5)

Example: if df2.HJ = 99 and df2.JK = P3, then Ans = 67 (from df1)

df1

╔════╦════╦══════╦══════╦══════╦══════╗
║ HJ ║ P1 ║  P2  ║  P3  ║  P4  ║  P5  ║
╠════╬════╬══════╬══════╬══════╬══════╣
║  5 ║ 51 ║  33  ║  21  ║  31  ║  13  ║
║ 11 ║ 66 ║  45  ║  21  ║  49  ║  58  ║
║ 21 ║  7 ║  55  ║  56  ║  67  ║  73  ║
║ 99 ║  0 ║  76  ║  67  ║  98  ║  29  ║
║ 15 ║ 11 ║  42  ║  79  ║  27  ║  54  ║
╚════╩════╩══════╩══════╩══════╩══════╝

df2

╔════╦════╗
║ HJ ║ JK ║
╠════╬════╣
║ 99 ║ P1 ║
║ 11 ║ P5 ║
║  5 ║ P3 ║
║ 21 ║ P2 ║
║ 11 ║ P3 ║
╚════╩════╝

Expected result for df2 after extraction from df1

╔════╦════╦═══════╗
║ HJ ║ JK ║  Ans  ║
╠════╬════╬═══════╣
║ 99 ║ P1 ║    0  ║
║ 11 ║ P5 ║   58  ║
║  5 ║ P3 ║   21  ║
║ 21 ║ P2 ║   55  ║
║ 11 ║ P3 ║   21  ║
╚════╩════╩═══════╝

code for df1

import pandas as pd
import numpy as np
data = {'HJ': [5, 11, 21, 99, 15],
        'P1': [51, 66, 7, 0, 11],
        'P2': [33, 45, 55, 76, 42],
        'P3': [21, 21, 56, 67, 79],
        'P4': [31, 49, 67, 98, 27],
        'P5': [13, 58, 73, 29, 54]}
df1 = pd.DataFrame(data)

code for df2

data = {'HJ': [99, 11, 5, 21, 11],
        'JK': ['P1', 'P5', 'P3', 'P2', 'P3']}
df2 = pd.DataFrame(data)

Regards Thank you

===========

Update

@Scott Boston's solution works:

df2['ans'] = df1.set_index('HJ').lookup(df2['HJ'], df2['JK']) 

However, KeyError: 'One or more row labels was not found' is raised when df2 contains labels that are not found in df1. Is there any way to overcome this problem?

ManOnTheMoon asked Jun 09 '20 16:06




1 Answer

Use pd.DataFrame.lookup after set_index:

df2['ans'] = df1.set_index('HJ').lookup(df2['HJ'], df2['JK'])
print(df2)

Output:

   HJ  JK  ans
0  99  P1    0
1  11  P5   58
2   5  P3   21
3  21  P2   55
4  11  P3   21
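
Note that DataFrame.lookup was deprecated in pandas 1.2.0 and removed in pandas 2.0, so the call above only runs on older versions. A minimal sketch of one lookup-free alternative (a swapped-in technique, not the approach shown in this answer) reshapes df1 to long form with melt and left-merges it onto df2. The names long_df1 and result are illustrative only:

```python
import pandas as pd

df1 = pd.DataFrame({'HJ': [5, 11, 21, 99, 15],
                    'P1': [51, 66, 7, 0, 11],
                    'P2': [33, 45, 55, 76, 42],
                    'P3': [21, 21, 56, 67, 79],
                    'P4': [31, 49, 67, 98, 27],
                    'P5': [13, 58, 73, 29, 54]})
df2 = pd.DataFrame({'HJ': [99, 11, 5, 21, 11],
                    'JK': ['P1', 'P5', 'P3', 'P2', 'P3']})

# Reshape df1 to long form: one row per (HJ, JK) pair, value stored in 'ans'.
long_df1 = df1.melt(id_vars='HJ', var_name='JK', value_name='ans')

# A left merge keeps every row of df2 in its original order;
# an (HJ, JK) pair that does not exist in df1 would get NaN in 'ans'.
result = df2.merge(long_df1, on=['HJ', 'JK'], how='left')
print(result)
```

Because the merge is keyed on both HJ and JK, this sketch assumes each HJ value appears only once in df1; duplicated HJ rows would produce duplicated rows in the result.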

To avoid the KeyError when some labels are missing, filter the inputs before passing them to lookup:

df2m = df2[df2['HJ'].isin(df1['HJ']) & df2['JK'].isin(df1.columns)].copy()

df2m['ans'] = df1.set_index('HJ').lookup(df2m['HJ'],df2m['JK'])

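# df2.update(df2m) would not add the new 'ans' column, so combine instead;
# rows that were filtered out get NaN in 'ans'
df2 = df2m.combine_first(df2)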
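
As a sketch of another way to tolerate missing labels without filtering first, the row and column positions can be resolved with Index.get_indexer, which returns -1 for any label it cannot find. This assumes the HJ values in df1 are unique (get_indexer raises on a non-unique index). The two extra rows in df2 (HJ 42 and column P9) are hypothetical, added only to exercise the missing-label case, and the names table, rows, cols and found are illustrative:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'HJ': [5, 11, 21, 99, 15],
                    'P1': [51, 66, 7, 0, 11],
                    'P2': [33, 45, 55, 76, 42],
                    'P3': [21, 21, 56, 67, 79],
                    'P4': [31, 49, 67, 98, 27],
                    'P5': [13, 58, 73, 29, 54]})
# Last two rows are hypothetical: HJ 42 and column P9 do not exist in df1.
df2 = pd.DataFrame({'HJ': [99, 11, 5, 21, 11, 42, 5],
                    'JK': ['P1', 'P5', 'P3', 'P2', 'P3', 'P1', 'P9']})

table = df1.set_index('HJ')

# Integer positions of each requested label; -1 marks a label that is absent.
rows = table.index.get_indexer(df2['HJ'])
cols = table.columns.get_indexer(df2['JK'])
found = (rows >= 0) & (cols >= 0)

# Start with NaN everywhere, then fill only the pairs that exist in df1.
ans = np.full(len(df2), np.nan)
ans[found] = table.to_numpy()[rows[found], cols[found]]
df2['ans'] = ans
print(df2)
```

The 'ans' column comes out as float because NaN marks the unmatched rows. Whether NaN or some other sentinel is preferable depends on how the result is used downstream.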
Scott Boston answered Oct 07 '22 14:10