
How to count the number of occurrences before a particular value in a dataframe in Python?

I have a dataframe like below:

A   B   C
1   1   1
2   0   1
3   0   0
4   1   0
5   0   1
6   0   1
7   1   0

I want the number of occurrences of zeroes in df['B'] under the following condition:

if(df['B']<df['C']):
  # count the zeroes in the following rows of df['B'] until the next 1

expected output:

A   B   C   output
1   1   1   NaN
2   0   1   1
3   0   0   NaN
4   1   0   NaN
5   0   1   1
6   0   1   0
7   1   0   NaN

I don't know how to formulate the count part. Any help is really appreciated.

hakuna_code asked Sep 13 '19 14:09



1 Answer

If I understand correctly, one approach is to build a custom grouper and number the rows within each group with groupby.cumcount:

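# True where B < C; only these rows get a count
c1 = df.B.lt(df.C)
# Group label that increases at every 1 in B, so each group is a 1 plus the zeroes after it
g = df.B.eq(1).cumsum()
# Descending position within each group, shifted down one row, kept where c1 holds, minus 1
df['out'] = c1.groupby(g).cumcount(ascending=False).shift().where(c1).sub(1)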

print(df)

   A  B  C  out
0  1  1  1  NaN
1  2  0  1  1.0
2  3  0  0  NaN
3  4  1  0  NaN
4  5  0  1  1.0
5  6  0  1  0.0
6  7  1  0  NaN
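
To make the steps easier to follow, here is a self-contained sketch that rebuilds the sample frame, prints the intermediate grouper, and cross-checks the result against a plain loop. It assumes the C values shown in the printed output above (C = 1 in the row where A = 6). The names `remaining`, `steps`, and `zeros_until_next_one` are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data; C values taken from the printed output above (assumption)
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 0, 0, 1, 0, 0, 1],
    "C": [1, 1, 0, 0, 1, 1, 0],
})

c1 = df.B.lt(df.C)          # rows where B < C
g = df.B.eq(1).cumsum()     # group id, increases by one at every B == 1

# Number of rows that still follow the current row inside its group
remaining = c1.groupby(g).cumcount(ascending=False)

# Intermediate values side by side, for inspection only
steps = pd.DataFrame({"c1": c1, "g": g, "remaining": remaining})
print(steps)

# Same final expression as in the answer
df["out"] = remaining.shift().where(c1).sub(1)
print(df)


def zeros_until_next_one(b, c):
    """Loop-based reference (hypothetical helper): for each row with b < c,
    count the values in the following rows until the next 1 is reached."""
    out = []
    for i in range(len(b)):
        if b[i] < c[i]:
            n = 0
            for v in b[i + 1:]:
                if v == 1:
                    break
                n += 1
            out.append(float(n))
        else:
            out.append(np.nan)
    return out


expected = zeros_until_next_one(df.B.tolist(), df.C.tolist())
print(np.allclose(df["out"].to_numpy(), np.array(expected), equal_nan=True))
```

On this sample, `remaining` is 2, 1, 0, 2, 1, 0, 0, so the `shift()` and `sub(1)` steps offset each other for the rows where `c1` holds. The loop version is slower, but it states the counting rule directly and is handy for checking the vectorised result on your own data.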

yatu answered Oct 15 '22 22:10