Calculate weights for grouped data in pandas

Tags: python, pandas

I would like to calculate portfolio weights with a pandas dataframe. Here is some dummy data for an example:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['ann', 'bob'] * 3}).sort_values('name').reset_index(drop=True)
df2 = pd.DataFrame({'stock': list('ABC') * 2})
df3 = pd.DataFrame({'val': np.random.randint(10, 100, 6)})
df = pd.concat([df1, df2, df3], axis=1)

Each person owns 3 stocks with a value val. We can calculate portfolio weights like this:

df.groupby('name').apply(lambda x: x.val/(x.val).sum())

which gives a Series of weights indexed by name and the original row index.

If I want to add a column wgt to df I need to merge this result back to df on name and index. This seems rather clunky.

Is there a way to do this in one step? Or what is the way to do this that best utilizes pandas features?

asked Sep 27 '22 03:09 by itzy


1 Answer

Use transform. It returns a Series whose index is aligned with your original df, so the result can be assigned directly as a new column:

In [114]:
df['wgt'] = df.groupby('name')['val'].transform(lambda x: x/x.sum())
df

Out[114]:
  name stock  val       wgt
0  ann     A   18  0.131387
1  ann     B   43  0.313869
2  ann     C   76  0.554745
3  bob     A   16  0.142857
4  bob     B   44  0.392857
5  bob     C   52  0.464286
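
As a possible variant (not part of the original answer), the same weights can be computed without a lambda by asking transform for each group's total and then dividing the whole column at once. The sketch below assumes the val figures from the output above rather than random data, and the name group_total is just an illustrative choice:

```python
import pandas as pd

# Values copied from the output shown above, so the result is reproducible.
df = pd.DataFrame({
    'name': ['ann'] * 3 + ['bob'] * 3,
    'stock': list('ABC') * 2,
    'val': [18, 43, 76, 16, 44, 52],
})

# transform('sum') broadcasts each group's total back to every row of that group.
group_total = df.groupby('name')['val'].transform('sum')

# Plain vectorized division; no Python-level lambda is called per group.
df['wgt'] = df['val'] / group_total

# Sanity check: the weights within each person's portfolio should add up to 1.
print(df.groupby('name')['wgt'].sum())
```

On larger frames this form may be faster than the lambda version, since the built-in 'sum' aggregation avoids a Python function call for every group, but the result should be the same either way.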

answered Oct 13 '22 00:10 by EdChum