Calculating distance to a row with a certain value

Tags: python, pandas

I am working with a pandas dataset in which maintenance work is done at a number of sites. Maintenance is done every four years at each site. I want to find the number of years since the last maintenance action at each site. The example below shows only two sites, but the original dataset has thousands of them. My data only covers the years 2014 through 2017.

Action = 0 means no action was performed that year; Action = 1 means an action was performed. Measurement is a performance reading related to the effect of the action. The action can happen in any year. I know that if an action was performed in year Y, the previous maintenance was performed in year Y-4.

 Site  Year   Action  Measurement
   A   2014     0         100
   A   2015     0         150
   A   2016     1         300
   A   2017     0         80
   B   2014     0         200
   B   2015     1         250
   B   2016     0         60
   B   2017     0         110

Given this dataset, I first want to build a temporary dataset like this:

 Site  Year   Action  Measurement  Years_Since_Last_Action
   A   2014     0         100           2
   A   2015     0         150           3
   A   2016     1         300           4
   A   2017     0         80            1
   B   2014     0         200           3
   B   2015     1         250           4
   B   2016     0         60            1
   B   2017     0         110           2

Then, I want to have:

Years_Since_Last_Action         Mean_Measurement
        1                            70
        2                            105
        3                            175
        4                            275

Thanks in advance!

azuber asked Jan 01 '23 20:01


2 Answers

Your first question

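import numpy as np

# Year in which each site's action took place (assumes one action year per site)
s = df.loc[df.Action == 1, ['Site', 'Year']].set_index('Site')
# Map that year back onto every row of the same site
df['Newyear'] = df.Site.map(s.Year)
s1 = df.Year - df.Newyear
# Rows at or before the action year count from the previous action, four years earlier
df['action since last year'] = np.where(s1 <= 0, s1 + 4, s1)
df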
Out[167]: 
  Site  Year  Action  Measurement  Newyear  action since last year
0    A  2014       0          100     2016                       2
1    A  2015       0          150     2016                       3
2    A  2016       1          300     2016                       4
3    A  2017       0           80     2016                       1
4    B  2014       0          200     2015                       3
5    B  2015       1          250     2015                       4
6    B  2016       0           60     2015                       1
7    B  2017       0          110     2015                       2

Your second question

df.groupby('action since last year').Measurement.mean()
Out[168]: 
action since last year
1     70
2    105
3    175
4    275
Name: Measurement, dtype: int64
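
The np.where step above amounts to modular arithmetic on the four-year cycle, so the two steps could also be written as below. This is only a sketch: it assumes a fixed four-year interval and at least one action row per site, and the names CYCLE, action_year, offset and result are illustrative rather than taken from the code above.

```python
import pandas as pd

CYCLE = 4  # assumed fixed maintenance interval, in years

# Sample data from the question
df = pd.DataFrame({
    "Site": list("AAAABBBB"),
    "Year": [2014, 2015, 2016, 2017] * 2,
    "Action": [0, 0, 1, 0, 0, 1, 0, 0],
    "Measurement": [100, 150, 300, 80, 200, 250, 60, 110],
})

# One known action year per site (the first one found, if there were several)
action_year = df.loc[df["Action"] == 1].groupby("Site")["Year"].first()

# Signed offset from the known action year; negative means "before it"
offset = df["Year"] - df["Site"].map(action_year)

# Wrap the offset into 1..CYCLE, so the action year itself maps to CYCLE
df["Years_Since_Last_Action"] = (offset - 1) % CYCLE + 1

result = df.groupby("Years_Since_Last_Action")["Measurement"].mean()
print(result)
```

Taking the first action year per site keeps the index unique, which Series.map needs when it looks values up in another Series.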
BENY answered Jan 12 '23 13:01


First, build your intermediate column using groupby, ffill/bfill, and a little arithmetic.

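import numpy as np

# Keep the year only on action rows, spread it over the rest, then subtract each row's year.
# Note: only ffill() runs per site; bfill() is not grouped, which is safe here
# because every site has an action year somewhere in the data.
v = (df.Year
       .where(df.Action.astype(bool))
       .groupby(df.Site)
       .ffill()
       .bfill()
       .sub(df.Year))
# v > 0: the action is still ahead; v < 0: it is behind; v == 0: the action year itself
df['Years_Since_Last_Action'] = np.select([v > 0, v < 0], [4 - v, v.abs()], default=4)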

df
  Site  Year  Action  Measurement  Years_Since_Last_Action
0    A  2014       0          100                      2.0
1    A  2015       0          150                      3.0
2    A  2016       1          300                      4.0
3    A  2017       0           80                      1.0
4    B  2014       0          200                      3.0
5    B  2015       1          250                      4.0
6    B  2016       0           60                      1.0
7    B  2017       0          110                      2.0

Next,

df.groupby('Years_Since_Last_Action', as_index=False).Measurement.mean()

   Years_Since_Last_Action  Measurement
0                      1.0           70
1                      2.0          105
2                      3.0          175
3                      4.0          275
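
If the result column should be called Mean_Measurement, as in the question's desired output, named aggregation is one way to get it. A minimal sketch, assuming pandas 0.25 or later; the intermediate table is typed in by hand here and the variable name summary is illustrative.

```python
import pandas as pd

# Intermediate table from the question, typed in by hand for illustration
df = pd.DataFrame({
    "Site": list("AAAABBBB"),
    "Year": [2014, 2015, 2016, 2017] * 2,
    "Measurement": [100, 150, 300, 80, 200, 250, 60, 110],
    "Years_Since_Last_Action": [2, 3, 4, 1, 3, 4, 1, 2],
})

# Named aggregation: output column name = (input column, aggregation function)
summary = (df.groupby("Years_Since_Last_Action", as_index=False)
             .agg(Mean_Measurement=("Measurement", "mean")))
print(summary)
```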
cs95 answered Jan 12 '23 13:01