Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using column header and values from one dataframe to find weights in another dataframe

I have two tables that look like the following

    ID  param1  param2  param3
0  A12       2       1       1
1  B15       1       2       1
2  B20       2       2       1
...

and

    parameter  value  weight
0      param1      1      10
1      param1      2      13
2      param2      1      21
3      param2      2      39
4      param3      1      49
5      param3      2      61

What's the best way to take the first dataframe, lookup each parameter's weight in the second dataframe and return a dataframe like the following?

    ID  param1  param2  param3
0  A12      13      21      49
1  B15      10      39      49
2  B20      13      39      49

What I was thinking was to write a function given the parameter, and value, subset table2 like the following, table2[(table2['parameter'] = parameter) & (table2['value'] = value)] and doing some kind of vectorized application to each column in table1, but Im not sure how to go over each value and do the lookup.

like image 726
SharpObject Avatar asked Oct 26 '16 21:10

SharpObject


2 Answers

One option would be to transform table1 to long format, merge with table2 on parameters and values, and then transform it back to wide format:

In [85]: pd.merge(pd.melt(df1, id_vars='ID'), df2,
                  left_on=['variable', 'value'], right_on=['parameter', 'value']
                   ).pivot('ID', 'parameter', 'weight')
Out[85]:
parameter  param1  param2  param3
ID
A12            13      21      49
B15            10      39      49
B20            13      39      49
like image 86
Psidom Avatar answered Oct 27 '22 10:10

Psidom


A bunch of thrashing about and I came up with something far inferior to @Psidom. I only convinced myself to post to provide some insight into getting at the same solution with varying techniques.

In [55]: (df1.set_index('ID')
             .rename_axis('parameter', 1)
             .stack()
             .reset_index(name='value')
             .merge(df2)
             .set_index(['ID', 'parameter']).weight.unstack())
Out[55]:
parameter  param1  param2  param3
ID
A12            13      21      49
B15            10      39      49
B20            13      39      49
like image 26
piRSquared Avatar answered Oct 27 '22 09:10

piRSquared