
Pandas Lookup to be deprecated - elegant and efficient alternative

The pandas DataFrame.lookup method is deprecated and will be removed in a future version. The deprecation warning recommends using DataFrame.melt and DataFrame.loc as a substitute.

import pandas as pd

df = pd.DataFrame({'B': ['X', 'X', 'Y', 'X', 'Y', 'Y',
                         'X', 'X', 'Y', 'Y', 'X', 'Y'],
                   'group': ["IT", "IT", "IT", "MV", "MV", "MV", 
                             "IT", "MV", "MV", "IT", "IT", "MV"]})

a = (pd.concat([df, df['B'].str.get_dummies()], axis=1)
     .groupby('group').rolling(3, min_periods=1).sum()
     .sort_index(level=1).reset_index(drop=True))        

df['count'] = a.lookup(df.index, df['B'])

>  Output Warning:  <ipython-input-16-e5b517460c82>:7: FutureWarning:
> The 'lookup' method is deprecated and will be  removed in a future
> version. You can use DataFrame.melt and DataFrame.loc as a substitute.

However, the alternative appears to be less elegant and slower:

b = pd.melt(a, value_vars=a.columns, var_name='B', ignore_index=False)
b.index.name='index'
df.index.name='index'
df = df.merge(b, on=['index','B'])

Is there a more elegant and more efficient approach to this?

nrcjea001 asked Jan 25 '21 09:01


2 Answers

It looks like you can just use the index to assign the new values.

dfn = df.set_index('B', append=True)
dfn['count'] = a.stack()
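
The two lines above leave the result in dfn, which is indexed by (row label, B) rather than in the original layout. A minimal end-to-end sketch follows, under two assumptions that are not from the answer: a is rebuilt from the dummy columns only, so that no non-numeric column enters the rolling sum, and the final reset_index step and the name result are added here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "B": list("XXYXYYXXYYXY"),
    "group": ["IT", "IT", "IT", "MV", "MV", "MV",
              "IT", "MV", "MV", "IT", "IT", "MV"],
})

# Rolling per-group counts of each label, realigned to the original row order.
# Only the dummy columns are aggregated (an assumption made for this sketch).
a = (df["B"].str.get_dummies()
     .groupby(df["group"]).rolling(3, min_periods=1).sum()
     .droplevel(0).sort_index())

# stack() yields a Series keyed by (row label, column name). Assigning it to a
# frame indexed by (row label, B) keeps only the pairs that exist in dfn.
dfn = df.set_index("B", append=True)
dfn["count"] = a.stack()

# Move B back to a column to recover the original layout (illustrative step).
result = dfn.reset_index(level="B")[["B", "group", "count"]]
print(result)
```

The assignment works because pandas aligns the stacked Series to dfn on index values, so no explicit merge is needed.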
Ferris answered Nov 08 '22 09:11


One idea is to use DataFrame.stack with DataFrame.join to match on the index and B:

df1 = df.rename_axis('i').join(a.stack().rename('count'), on=['i','B'])
print (df1)
    B group  count
i                 
0   X    IT    1.0
1   X    IT    2.0
2   Y    IT    1.0
3   X    MV    1.0
4   Y    MV    1.0
5   Y    MV    2.0
6   X    IT    2.0
7   X    MV    1.0
8   Y    MV    2.0
9   Y    IT    2.0
10  X    IT    2.0
11  Y    MV    2.0
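
Separately from the join shown above, a positional lookup with NumPy indexing may be worth trying when speed matters. This is only a sketch, and it assumes that a has unique column labels and the same row order as df. Here a is rebuilt from the dummy columns only, and col_pos is a name introduced for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "B": list("XXYXYYXXYYXY"),
    "group": ["IT", "IT", "IT", "MV", "MV", "MV",
              "IT", "MV", "MV", "IT", "IT", "MV"],
})

# Rolling per-group counts of each label, realigned to the original row order.
a = (df["B"].str.get_dummies()
     .groupby(df["group"]).rolling(3, min_periods=1).sum()
     .droplevel(0).sort_index())

# Translate each row's label in B into a column position of `a`.
# get_indexer returns -1 for labels that are not columns of `a`.
col_pos = a.columns.get_indexer(df["B"])
if (col_pos < 0).any():
    raise KeyError("some labels in B are not columns of a")

# Pick one value per row: row i, column col_pos[i].
# Assumes `a` and `df` share the same row order (true here by construction).
df["count"] = a.to_numpy()[np.arange(len(df)), col_pos]
print(df)
```

This avoids reshaping a altogether, at the cost of relying on positional rather than label-based alignment of the rows.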
jezrael answered Nov 08 '22 07:11