Find value greater than level - Python Pandas

Tags: python, pandas

In a time series (ordered tuples), what's the most efficient way to find the first time a criterion is met?

In particular, what's the most efficient way to determine when the value of a column in a pandas DataFrame first goes over 100?

I was hoping for a clever vectorized solution, and not having to use df.iterrows().

For example, with price or count data, I want to find when a value first exceeds 100, i.e. df['col'] > 100:

              price
date 
2005-01-01     98
2005-01-02     99
2005-01-03     100
2005-01-04     99
2005-01-05     98
2005-01-06     100
2005-01-07     100
2005-01-08     98

The real series could be very large, though. Is it better to iterate (slow), or is there a vectorized solution?

A df.iterrows() solution could be:

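def first_exceeding(df, value_to_check):
    # iterrows() yields (index, row) pairs, in that order
    for ind, row in df.iterrows():
        if row['col'] > value_to_check:
            # return the recorded value from the first row that meets the criterion
            return row['value_to_record']
    # no row exceeded value_to_check
    return None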

But my question is more about efficiency (potentially, a vectorized solution that will scale well).


asked Aug 10 '16 00:08

Jared

1 Answer

Try this, using "> 99" as the threshold:

df[df['price'].gt(99)].index[0]

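This returns 2, the index label of the first row whose price is greater than 99 (the third row when the index is the default 0-based integer index).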
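
For a self-contained check, here is a minimal sketch that rebuilds the sample data from the question. It assumes a default integer index with date as an ordinary column, which is what the integer result above implies. The helper name first_index_over is hypothetical, not part of pandas; it returns None when nothing matches, because .index[0] raises an IndexError on an empty result.

```python
import pandas as pd

# Sample data from the question; the default integer index is an assumption
df = pd.DataFrame({
    "date": pd.date_range("2005-01-01", periods=8, freq="D"),
    "price": [98, 99, 100, 99, 98, 100, 100, 98],
})

def first_index_over(frame, column, level):
    """Return the index label of the first row where column > level, else None."""
    # Boolean mask as a NumPy array, used to select the matching index labels
    mask = frame[column].gt(level).to_numpy()
    matches = frame.index[mask]
    return matches[0] if len(matches) else None

first = first_index_over(df, "price", 99)        # first row with price > 99
none_found = first_index_over(df, "price", 100)  # nothing in the sample exceeds 100
```

With the sample data, nothing is strictly greater than 100, which is why the threshold here is 99.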

To get the index labels of all rows with a price greater than 99:

df[df['price'].gt(99)].index
Int64Index([2, 5, 6], dtype='int64')
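
Since the question keeps the dates in the index, one possible alternative (not taken from the answer above) is to work on the boolean mask directly instead of filtering the DataFrame. The sketch below assumes a DatetimeIndex like the one in the question, and first_label_over is a hypothetical helper name. Series.idxmax on a boolean mask gives the label of the first True, but it also returns the first label when every entry is False, so the sketch checks mask.any() first.

```python
import pandas as pd

def first_label_over(series, level):
    """Label of the first element greater than level, or None if there is none."""
    mask = series.gt(level)
    # idxmax returns the first label even when every entry is False,
    # so rule out the no-match case explicitly
    if not mask.any():
        return None
    return mask.idxmax()

# Sample prices from the question, indexed by date
price = pd.Series(
    [98, 99, 100, 99, 98, 100, 100, 98],
    index=pd.date_range("2005-01-01", periods=8, freq="D"),
    name="price",
)

first_date = first_label_over(price, 99)
```

Like the filtering approach, this still evaluates the comparison over the whole column; it does not stop at the first match.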


answered Nov 15 '22 22:11

Merlin
