I would like to calculate portfolio weights with a pandas dataframe. Here is some dummy data for an example:
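import numpy as np
import pandas as pd

# Six rows: three stocks (A, B, C) for each of two people
df1 = pd.DataFrame({'name': ['ann', 'bob'] * 3}).sort_values('name').reset_index(drop=True)
df2 = pd.DataFrame({'stock': list('ABC') * 2})
df3 = pd.DataFrame({'val': np.random.randint(10, 100, 6)})
df = pd.concat([df1, df2, df3], axis=1)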
Each person owns 3 stocks, each with a value val. We can calculate portfolio weights like this:
df.groupby('name').apply(lambda x: x.val/(x.val).sum())
which gives this:
If I want to add a column wgt to df, I need to merge this result back to df on name and index. This seems rather clunky.
Is there a way to do this in one step? Or, failing that, what approach makes the best use of pandas features?
Use transform. It returns a Series whose index is aligned with your original df, so the result can be assigned directly as a new column:
In [114]:
df['wgt'] = df.groupby('name')['val'].transform(lambda x: x/x.sum())
df
Out[114]:
  name stock  val       wgt
0  ann     A   18  0.131387
1  ann     B   43  0.313869
2  ann     C   76  0.554745
3  bob     A   16  0.142857
4  bob     B   44  0.392857
5  bob     C   52  0.464286
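As a variation on the same idea, here is a minimal sketch that avoids the lambda by passing the string 'sum' to transform and dividing afterwards. It is not part of the original answer. The values are hard-coded from the output above (an assumption made for reproducibility, since the original data was random), and group_total is a name introduced only for this illustration.

```python
import pandas as pd

# Values copied from the output above so the result is reproducible.
df = pd.DataFrame({
    'name': ['ann'] * 3 + ['bob'] * 3,
    'stock': list('ABC') * 2,
    'val': [18, 43, 76, 16, 44, 52],
})

# transform('sum') broadcasts each group's total back onto the original rows,
# so the division is a plain element-wise operation and no merge is needed.
group_total = df.groupby('name')['val'].transform('sum')
df['wgt'] = df['val'] / group_total

# Sanity check: the weights within each person's portfolio should sum to 1.
print(df.groupby('name')['wgt'].sum())
```

Because the result of transform has the same index as df, the new column lines up row by row regardless of how the rows are ordered.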