Pandas: assigning columns with multiple conditions and date thresholds

Question

Edited:

I have a financial portfolio in a pandas dataframe df, where the index is the date and I have multiple financial stocks per date.

Eg dataframe:

Date    Stock   Weight  Percentile  Final weight
1/1/2000    Apple   0.010   0.75    0.010
1/1/2000    IBM    0.011    0.4     0
1/1/2000    Google  0.012   0.45    0
1/1/2000    Nokia   0.022   0.81    0.022
2/1/2000    Apple   0.014   0.56    0
2/1/2000    Google  0.015   0.45    0
2/1/2000    Nokia   0.016   0.55    0
3/1/2000    Apple   0.020   0.52    0
3/1/2000    Google  0.030   0.51    0
3/1/2000    Nokia   0.040   0.47    0

I created Final_weight by doing assigning values of Weight whenever Percentile is greater than 0.7

Now I want this to be a bit more sophisticated, I still want Weight to be assigned to Final_weight when Percentile is > 0.7, however after this date (at any point in the future), rather than become 0 when a stocks Percentile is not >0.7, we would still get a weight as long as the Stocks Percentile is above 0.5 (ie holding the position for longer than just one day).

Then if the stock goes below 0.5 (in the near future) then Final_weight would become 0.

Eg modified dataframe from above:

Date    Stock   Weight  Percentile  Final weight
1/1/2000    Apple   0.010   0.75    0.010
1/1/2000    IBM     0.011   0.4     0
1/1/2000    Google  0.012   0.45    0
1/1/2000    Nokia   0.022   0.81    0.022
2/1/2000    Apple   0.014   0.56    0.014
2/1/2000    Google  0.015   0.45    0
2/1/2000    Nokia   0.016   0.55    0.016
3/1/2000    Apple   0.020   0.52    0.020
3/1/2000    Google  0.030   0.51    0
3/1/2000    Nokia   0.040   0.47    0

Everyday the portfolios are different not always have the same stock from the day before.

cronos · Accepted Answer

This solution is more explicit and less pandas-esque, but it involves only a single pass through all rows without creating tons of temporary columns, and is therefore possibly faster. It needs an additional state variable, which I wrapped it into a closure for not having to make a class.

def closure():
    cur_weight = {}
    def func(x):
        if x["Percentile"] > 0.7:
            next_weight = x["Weight"]
        elif x["Percentile"] < 0.5 :
            next_weight = 0
        else:
            next_weight = x["Weight"] if cur_weight.get(x["Stock"], 0) > 0 else 0
        cur_weight[x["Stock"]] = next_weight
        return next_weight
    return func

df["FinalWeight"] = df.apply(closure(), axis=1)

piRSquared · Answer

I'd first put 'Stock' into the index
Then unstack to put them into the columns
I'd then split w for weights and p for percentiles
Then manipulate with a series of where

d1 = df.set_index('Stock', append=True)

d2 = d1.unstack()

w, p = d2.Weight, d2.Percentile

d1.join(w.where(p > .7, w.where((p.shift() > .7) & (p > .5), 0)).stack().rename('Final Weight'))

                   Weight  Percentile  Final Weight
Date       Stock                                   
2000-01-01 Apple    0.010        0.75         0.010
           IBM      0.011        0.40         0.000
           Google   0.012        0.45         0.000
           Nokia    0.022        0.81         0.022
2000-02-01 Apple    0.014        0.56         0.014
           Google   0.015        0.45         0.000
           Nokia    0.016        0.55         0.016

Pandas: assigning columns with multiple conditions and date thresholds

Tags:

python

pandas

dataframe

finance

portfolio

MysterioProgrammer91

Video Answer

2 Answers

cronos

piRSquared

Recent Activity

Donate For Us

Pandas: assigning columns with multiple conditions and date thresholds

Tags:

python

pandas

dataframe

finance

portfolio

MysterioProgrammer91

Video Answer

2 Answers

cronos

piRSquared

Related questions

Recent Activity

Donate For Us