Group by pandas DataFrame and select maximum value within sequence

I have a pandas DataFrame that holds elevation differences between points, sampled every 10 degrees, for several target turbines. I have already filtered the elevation differences by a criterion and added a column that shows whether the points are consecutive (metDegDiff = 10 means the point is 10 degrees from the previous one).

How can I select the maximum value of elevDif for each targTurb within runs of 3 or more consecutive 10-degree points?

import pandas as pd

ridgeDF2 = pd.DataFrame(data = {
    'MetID':['A06_40','A06_50','A06_60','A06_70','A06_80','A06_100','A06_110','A06_140','A07_110','A07_130','A07_140','A08_100','A08_110','A08_120','A08_130','A08_220'],
    'targTurb':['A06','A06','A06','A06','A06','A06','A06','A06','A07','A07','A07','A08','A08','A08','A08','A08'],
    'metDeg':[30,50,60,70,80,100,110,140,110,130,140,100,110,120,130,220],
    'elevDif':[1.433234, 1.602997,3.227997,2.002991,2.414001,2.96402,1.513,1.793976,1.612,2.429993,1.639008,1.500977,3.048004,2.174011,1.813995,1.527008],
    'metDegDiff':[20,10,10,10,10,20,10,30,-30,20,10,-40,10,10,10,30]})

>>> ridgeDF2
      MetID targTurb  metDeg   elevDif  metDegDiff
0    A06_40      A06      30  1.433234          20
1    A06_50      A06      50  1.602997          10
2    A06_60      A06      60  3.227997          10
3    A06_70      A06      70  2.002991          10
4    A06_80      A06      80  2.414001          10
5   A06_100      A06     100  2.964020          20
6   A06_110      A06     110  1.513000          10
7   A06_140      A06     140  1.793976          30
8   A07_110      A07     110  1.612000         -30
9   A07_130      A07     130  2.429993          20
10  A07_140      A07     140  1.639008          10
11  A08_100      A08     100  1.500977         -40
12  A08_110      A08     110  3.048004          10
13  A08_120      A08     120  2.174011          10
14  A08_130      A08     130  1.813995          10
15  A08_220      A08     220  1.527008          30

In the example, A06 has 4 rows with consecutive 10-degree steps (rows 1, 2, 3 and 4) and A08 has 3 (rows 12, 13 and 14). Both of those runs have a length of 3 or more.

So the output would be the row with the maximum elevDif within each of those two runs, like this:

  MetID  targTurb  metDeg   elevDif  metDegDiff
 A06_60       A06      60  3.227997          10
A08_110       A08     110  3.048004          10

asked Jul 06 '20 20:07

casti

1 Answer

The code below should work. You can run each line separately to see what is happening.

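# Flag the rows that break a run of 10-degree steps
ridgeDF2['t/f'] = ridgeDF2['metDegDiff'] != 10
# Shift the flag down one row and take a running total to label each run
ridgeDF2['t/f'] = ridgeDF2['t/f'].shift(fill_value=False).cumsum()
# Rows per run label, minus one for the break row that closes the run
ridgeDF2['count'] = ridgeDF2.groupby('t/f')['t/f'].transform(len)-1
ridgeDF2['count'] = ridgeDF2['count'] >= 3
# The break rows themselves never qualify
ridgeDF2.loc[ridgeDF2['metDegDiff'] != 10,'count'] = False
# Keep the qualifying rows, then pick the row with the largest elevDif in each run
highest = ridgeDF2.loc[ridgeDF2['count']]
highest = highest.loc[highest.groupby(['targTurb','metDegDiff','t/f'])['elevDif'].idxmax()]
highest.drop(columns = ['t/f','count'])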
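
If you would rather not add helper columns to ridgeDF2, the same idea can be wrapped in a function. This is only a sketch under a few assumptions: the name max_in_consecutive_runs and the parameters min_len and step are made up for illustration, the rows are assumed to be sorted by targTurb and then metDeg, and metDegDiff is assumed to hold the step from the previous row. It also starts a new run whenever targTurb changes, which the code above does not do explicitly.

```python
import pandas as pd


def max_in_consecutive_runs(df, min_len=3, step=10):
    """Return the row with the largest elevDif in each run of at least
    min_len rows whose metDegDiff equals step (hypothetical helper)."""
    is_step = df['metDegDiff'].eq(step)
    # A new run label starts at every break row and whenever the turbine changes.
    new_run = ~is_step | df['targTurb'].ne(df['targTurb'].shift())
    run_id = new_run.cumsum()
    # Count only the step rows under each run label (break rows count as 0).
    run_len = is_step.astype(int).groupby(run_id).transform('sum')
    # Keep step rows that belong to a long enough run.
    candidates = df[is_step & run_len.ge(min_len)]
    # Index label of the largest elevDif within each remaining run.
    best = candidates.groupby(run_id.loc[candidates.index])['elevDif'].idxmax()
    return df.loc[best]
```

On the sample data, max_in_consecutive_runs(ridgeDF2) returns rows 2 and 12 (A06_60 and A08_110), and ridgeDF2 itself is left unchanged.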

answered Oct 31 '22 14:10

rhug123

Tags: python, pandas, pandas-groupby
