Counting the number of consecutive values that meet a condition (Pandas DataFrame)

Tags: python, pandas, dataframe, numpy, series

So I created this post regarding my problem 2 days ago and got an answer thankfully.

I have a dataset of 20 rows and 2500 columns. Each column is a unique product and the rows form a time series of measurement results. So each product is measured 20 times, and there are 2500 products.

This time I want to know how many consecutive rows a measurement result stays above a specific threshold. In other words, I want to count the number of consecutive values that are above a given value, let's say 5.

A = [1, 2, 6, 8, 7, 3, 2, 3, 6, 10, 2, 1, 0, 2]

The values above 5 are 6, 8, 7 and 6, 10 (shown in bold in the original post). According to what I defined above, I should get NumofConsFeature = 3 as the result, i.e. the maximum when more than one run meets the condition.

I thought of filtering using .gt, then getting the indexes and using a loop afterwards in order to detect the consecutive index numbers but couldn't make it work.

In a second phase, I'd like to know the index of the first value of that consecutive run. For the above example, that would be 3 (counting positions from 1; the zero-based index is 2). But I have no idea how to do this one.

Thanks in advance.


asked Oct 05 '18 18:10

meliksahturker

1 Answer

Here's another answer using only Pandas functions:

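import pandas as pd

A = [1, 2, 6, 8, 7, 3, 2, 3, 6, 10, 2, 1, 0, 2]
a = pd.DataFrame(A, columns=['foo'])
# Flag the values above the threshold.
a['is_large'] = a.foo > 5
# Start a new group id every time the flag flips.
a['crossing'] = (a.is_large != a.is_large.shift()).cumsum()
# Count down within each run, so the first row of a run holds its length.
a['count'] = a.groupby(['is_large', 'crossing']).cumcount(ascending=False) + 1
a.loc[~a.is_large, 'count'] = 0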

which gives

    foo  is_large  crossing  count
0     1     False         1      0
1     2     False         1      0
2     6      True         2      3
3     8      True         2      2
4     7      True         2      1
5     3     False         3      0
6     2     False         3      0
7     3     False         3      0
8     6      True         4      2
9    10      True         4      1
10    2     False         5      0
11    1     False         5      0
12    0     False         5      0
13    2     False         5      0

From there on you can easily find the maximum and its index.
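As a rough sketch of that last step: because the count runs downward inside each run, the first row of a run carries the run's length, so the maximum of that column is the longest run and the position of the maximum is where it starts. The helper name `longest_run_above` and the column names `product_a` and `product_b` are made up for illustration and do not come from the question or the answer; the second column is invented data to show how the same idea could be repeated over many product columns.

```python
import pandas as pd


def longest_run_above(values: pd.Series, threshold: float):
    """Return (length, start_label) of the longest run of values > threshold.

    Hypothetical helper. Returns (0, None) if no value exceeds the threshold.
    If several runs share the maximum length, the first one is reported.
    """
    is_large = values > threshold
    if not is_large.any():
        return 0, None
    # New run id every time the above/below flag flips.
    run_id = (is_large != is_large.shift()).cumsum()
    # Count down inside each run; rows outside a run are set to 0.
    count = is_large.groupby(run_id).cumcount(ascending=False) + 1
    count = count.where(is_large, 0)
    # The first row of the longest run holds the maximum.
    return int(count.max()), count.idxmax()


A = [1, 2, 6, 8, 7, 3, 2, 3, 6, 10, 2, 1, 0, 2]
length, start = longest_run_above(pd.Series(A), 5)
print(length, start)

# Same idea applied column by column; product_b is invented sample data.
df = pd.DataFrame({
    "product_a": A,
    "product_b": [6, 7, 1, 9, 9, 9, 9, 0, 0, 0, 0, 0, 0, 0],
})
results = {name: longest_run_above(col, 5) for name, col in df.items()}
print(results)
```

For the list `A` this reports a longest run of 3 starting at index label 2, which matches the `count` column in the table above. With 2500 columns a plain loop over `df.items()` like this should be workable, though it has not been benchmarked here.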

answered Sep 20 '22 06:09

Bart

