
Counting the number of times a boolean goes from True to False in a column

I have a column in a DataFrame that is filled with booleans, and I want to count how many times it changes from True to False.

I can do this by converting the booleans to 1s and 0s, then using df.diff and dividing the result by 2.

import pandas as pd

d = {'Col1': [True, True, True, False, False, False, True, True, True, True, False, False, False, True, True, False, False, True]}
df = pd.DataFrame(data=d)

print(df)

     Col1
0    True
1    True
2    True
3   False
4   False
5   False
6    True
7    True
8    True
9    True
10  False
11  False
12  False
13   True
14   True
15  False
16  False
17   True

My expected outcome is 3, the number of times the column changes from True to False.
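One possible reading of the diff-based workaround described in the question, as a rough sketch only: the post does not show the code, so summing the absolute differences before halving is an assumption, and the variable names are illustrative. Halving gives the right count only when the column starts and ends on the same value, so that every True-to-False change is paired with a change back.

```python
import pandas as pd

s = pd.Series([True, True, True, False, False, False, True, True, True, True,
               False, False, False, True, True, False, False, True])

# Assumed step: every change, in either direction, shows up as +1 or -1 in the diff.
changes = s.astype(int).diff().abs().sum()

# Halving pairs each True-to-False change with the False-to-True change that follows it.
halved = changes / 2
print(halved)  # 3.0
```

If the last row were dropped, so that the column ended on False, the same calculation would give 2.5, which is why a direction-aware count is more reliable.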

Martijn van Amsterdam asked Jan 16 '19 15:01




2 Answers

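You can perform a bitwise AND of the negated Col1 with a mask indicating where the value changes between successive rows. A row is counted only if it is False and differs from the row before it, which is exactly a True-to-False change. Note that the first row always differs from its missing predecessor, so a column starting with False would be counted once too often:

(~df.Col1 & (df.Col1 != df.Col1.shift(1))).sum()
3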

The mask is obtained by comparing Col1 with a shifted version of itself (Series.shift):

df.Col1 != df.Col1.shift(1)

0      True
1     False
2     False
3      True
4     False
5     False
6      True
7     False
8     False
9     False
10     True
11    False
12    False
13     True
14    False
15     True
16    False
17     True
Name: Col1, dtype: bool

For multiple columns, the same idea applies to the whole DataFrame, negating it so that only False rows are counted (here tested with a Col2 identical to Col1):

(~df & (df != df.shift(1))).sum()

Col1    3
Col2    3
dtype: int64
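The shift-and-compare idea can also be written so that the first row never counts as a change. The following is a minimal sketch rather than part of the answer itself: it assumes a pandas version in which shift accepts fill_value, and the helper name count_true_to_false is made up for illustration.

```python
import pandas as pd

def count_true_to_false(data):
    """Count True-to-False changes in a boolean Series or DataFrame (hypothetical helper)."""
    # Value of the previous row; the first row gets False, so it can never look like a change.
    previous = data.shift(1, fill_value=False).astype(bool)
    # A change happens where the previous row was True and the current row is False.
    return (previous & ~data).sum()

col = [True, True, True, False, False, False, True, True, True, True,
       False, False, False, True, True, False, False, True]
df = pd.DataFrame({'Col1': col, 'Col2': col})

print(count_true_to_false(df['Col1']))  # 3
print(count_true_to_false(df))          # 3 for Col1 and 3 for Col2
```

Testing the previous row for True, instead of testing for any difference, keeps the direction of the change explicit and avoids the NaN that a plain shift puts in the first row.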
yatu answered Sep 21 '22 08:09


Notice that subtracting True (1) from False (0) in integer terms gives -1:

res = df['Col1'].astype(int).diff().eq(-1).sum()  # 3

To apply across a Boolean dataframe, you can construct a series mapping label to count:

res = df.astype(int).diff().eq(-1).sum()
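As a cross-check on the diff approach, the same count can be obtained by comparing each value with its successor on the underlying NumPy array. This is an alternative technique, not the method given in the answer, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [True, True, True, False, False, False, True, True, True, True,
                            False, False, False, True, True, False, False, True]})

values = df['Col1'].to_numpy()

# values[:-1] holds each row's predecessor, values[1:] holds the row itself.
# A True-to-False change is a True predecessor followed by a False row.
n_changes = np.count_nonzero(values[:-1] & ~values[1:])

# Same count via the integer diff: False (0) minus True (1) is -1.
via_diff = df['Col1'].astype(int).diff().eq(-1).sum()

print(n_changes, via_diff)  # 3 3
```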
jpp answered Sep 19 '22 08:09