I have a pandas DataFrame that holds elevation differences between points taken every 10 degrees for several target turbines. I have already selected the elevation differences that meet a criterion, and I have added a column that shows whether the points are consecutive (metDegDiff = 10 means the point is consecutive with the previous one).
How can I select, for each targTurb, the maximum elevDif within runs of 3 or more consecutive 10-degree points?
import pandas as pd

ridgeDF2 = pd.DataFrame(data = {
    'MetID':['A06_40','A06_50','A06_60','A06_70','A06_80','A06_100','A06_110','A06_140','A07_110','A07_130','A07_140','A08_100','A08_110','A08_120','A08_130','A08_220'],
    'targTurb':['A06','A06','A06','A06','A06','A06','A06','A06','A07','A07','A07','A08','A08','A08','A08','A08'],
    'metDeg':[30,50,60,70,80,100,110,140,110,130,140,100,110,120,130,220],
    'elevDif':[1.433234, 1.602997,3.227997,2.002991,2.414001,2.96402,1.513,1.793976,1.612,2.429993,1.639008,1.500977,3.048004,2.174011,1.813995,1.527008],
    'metDegDiff':[20,10,10,10,10,20,10,30,-30,20,10,-40,10,10,10,30]})
[Dbg]>>> ridgeDF2
      MetID targTurb  metDeg   elevDif  metDegDiff
0    A06_40      A06      30  1.433234          20
1    A06_50      A06      50  1.602997          10
2    A06_60      A06      60  3.227997          10
3    A06_70      A06      70  2.002991          10
4    A06_80      A06      80  2.414001          10
5   A06_100      A06     100  2.964020          20
6   A06_110      A06     110  1.513000          10
7   A06_140      A06     140  1.793976          30
8   A07_110      A07     110  1.612000         -30
9   A07_130      A07     130  2.429993          20
10  A07_140      A07     140  1.639008          10
11  A08_100      A08     100  1.500977         -40
12  A08_110      A08     110  3.048004          10
13  A08_120      A08     120  2.174011          10
14  A08_130      A08     130  1.813995          10
15  A08_220      A08     220  1.527008          30
In the example, A06 has 4 consecutive rows with metDegDiff equal to 10 (rows 1, 2, 3 and 4) and A08 has 3 (rows 12, 13 and 14). Both of those runs have a length of 3 or more.
So the output should be the row with the maximum elevDif inside each of those two runs, like this:
MetID  targTurb  metDeg   elevDif  metDegDiff
A06_60      A06      60  3.227997          10
A08_110     A08     110  3.048004          10
A pandas Series is a one-dimensional data structure with row index values. Its argmax() method returns the integer position of the maximum value in the data, and idxmax() returns the corresponding index label, which can then be used to access the row.
To get the maximum value of each row of a DataFrame, call the max() method on the DataFrame with the argument axis=1. It returns a Series whose index holds the row labels and whose values are the maxima of each row.
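As a small illustration of those calls, here is a sketch on made-up values (the Series, the DataFrame and all variable names are invented for the example and are not part of the question's data):

```python
import pandas as pd

# Illustrative Series; the labels and values are invented for this sketch
s = pd.Series([1.6, 3.2, 2.0], index=['A06_50', 'A06_60', 'A06_70'])
pos = s.argmax()    # integer position of the maximum value
label = s.idxmax()  # index label of the maximum value

# Illustrative DataFrame with two rows
df = pd.DataFrame({'a': [1, 5], 'b': [4, 2]}, index=['r0', 'r1'])
row_max = df.max(axis=1)  # Series of per-row maxima, indexed by row label
```

idxmax() is usually the more convenient of the two when the result is passed to .loc, because .loc selects by label.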
The code below should work. You can run each line separately to see what is happening.
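import numpy as np

# Flag the rows that break a run (metDegDiff != 10)
ridgeDF2['t/f'] = ridgeDF2['metDegDiff'] != 10
# Shift by one row and take the cumulative sum so that each run gets its own group id
ridgeDF2['t/f'] = ridgeDF2['t/f'].shift(fill_value=False).cumsum()
# Group size minus 1, for the non-10 row that closes each group in this data
ridgeDF2['count'] = ridgeDF2.groupby('t/f')['t/f'].transform(len)-1
ridgeDF2['count'] = np.where(ridgeDF2['count'] >= 3,True,False)
# The breaking rows themselves never belong to a run
ridgeDF2.loc[ridgeDF2['metDegDiff'] != 10,'count'] = False
highest = ridgeDF2.loc[ridgeDF2['count'] == True]
# Keep the row with the largest elevDif in each qualifying run
highest = highest.loc[highest.groupby(['targTurb','metDegDiff','t/f'])['elevDif'].idxmax()]
highest.drop(columns = ['t/f','count'])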
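One caveat: the count in that code relies on every run being followed by a row whose metDegDiff is not 10, so a run at the very end of the frame would be undercounted. Below is a minimal alternative sketch, not the answer's method, that counts the 10-degree rows directly and never treats the first row of a turbine as part of a run. The function name max_in_runs, its parameters min_len and step, and the helper column _run are made up for this example, and it assumes the rows are already sorted by targTurb and metDeg as in the question.

```python
import pandas as pd

ridgeDF2 = pd.DataFrame(data={
    'MetID': ['A06_40', 'A06_50', 'A06_60', 'A06_70', 'A06_80', 'A06_100',
              'A06_110', 'A06_140', 'A07_110', 'A07_130', 'A07_140',
              'A08_100', 'A08_110', 'A08_120', 'A08_130', 'A08_220'],
    'targTurb': ['A06'] * 8 + ['A07'] * 3 + ['A08'] * 5,
    'metDeg': [30, 50, 60, 70, 80, 100, 110, 140, 110, 130, 140,
               100, 110, 120, 130, 220],
    'elevDif': [1.433234, 1.602997, 3.227997, 2.002991, 2.414001, 2.96402,
                1.513, 1.793976, 1.612, 2.429993, 1.639008, 1.500977,
                3.048004, 2.174011, 1.813995, 1.527008],
    'metDegDiff': [20, 10, 10, 10, 10, 20, 10, 30, -30, 20, 10,
                   -40, 10, 10, 10, 30]})


def max_in_runs(df, min_len=3, step=10):
    """Hypothetical helper: row with the largest elevDif in each run of at
    least min_len rows whose metDegDiff equals step."""
    # The first row of each turbine is compared against another turbine,
    # so it is never counted as a step row (assumption of this sketch).
    first_of_turb = df['targTurb'].ne(df['targTurb'].shift())
    is_step = df['metDegDiff'].eq(step) & ~first_of_turb
    # Every non-step row opens a new run; the step rows after it share its id.
    run_id = (~is_step).cumsum()
    # Number of step rows in each run (the opening row is not counted).
    run_len = is_step.astype(int).groupby(run_id).transform('sum')
    keep = df.assign(_run=run_id)[is_step & run_len.ge(min_len)]
    best = keep.loc[keep.groupby('_run')['elevDif'].idxmax()]
    return best.drop(columns='_run')


result = max_in_runs(ridgeDF2)
```

On the question's data this selects the rows with MetID A06_60 and A08_110, the same two rows as the expected output. Because the original frame is not modified, the function can be called again with a different min_len or step.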