Resample dataframe with count method for certain conditions

Question

I'm trying to resample data from a dataframe. Columns have different types of data. For one of the columns I'd like to count the rows for which the column has a value larger than 0.

A small example would look like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(data={'Date': pd.date_range('2018-01-01','2018-01-15'),
                        'A': np.random.randint(5, size=15)})
df.set_index(df.Date, inplace=True)

df.resample('5D').count()

Counting works, but I can't find a way to insert the condition that I only want to count values larger than 0. Something like this:

df.resample('5D').count(df[df.A > 0])

However, this raises TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

Question: How can I use resample().count() with a condition?

jezrael · Accepted Answer

You can use Resampler.apply and sum the True values of the comparison, since each True is treated as 1:

import pandas as pd
import numpy as np

# Seed after importing numpy so the random sample is reproducible
np.random.seed(57)

df = pd.DataFrame(data={'Date': pd.date_range('2018-01-01','2018-01-15'),
                        'A': np.random.randint(5, size=15)})
df.set_index(df.Date, inplace=True)

df1 = df.resample('5D')['A'].apply(lambda x: (x > 0).sum())
print(df1)
Date
2018-01-01    2
2018-01-06    3
2018-01-11    4
Name: A, dtype: int64
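
The reason the sum works as a conditional count can be seen on a plain Series. This is a minimal sketch with made-up values (the Series `s` is not part of the original example): the comparison yields booleans, and summing them counts the True entries.

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the idea
s = pd.Series([0, 3, 0, 2, 1])

# The comparison returns a boolean Series
mask = s > 0

# Summing booleans treats True as 1 and False as 0,
# so the sum is the number of values larger than 0
n_positive = int(mask.sum())

print(mask.tolist())
print(n_positive)
```

The lambda passed to apply above does the same thing once per 5-day bin.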

A better solution is to create the boolean mask first and then aggregate it with resample and sum, which avoids calling a Python function for every bin:

df1 = (df['A'] > 0).resample('5D').sum().astype(int)
print(df1)

Date
2018-01-01    2
2018-01-06    3
2018-01-11    4
Name: A, dtype: int32
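
Since the question mentions columns of different types, the mask can also be stored as a helper column and aggregated together with the others. The following is only a sketch under assumptions: the column `B`, the helper name `A_pos`, and the fixed values are hypothetical and chosen so the result can be checked by hand.

```python
import pandas as pd

# Deterministic sample data (hypothetical values, not the random data above)
idx = pd.date_range('2018-01-01', '2018-01-15', name='Date')
df = pd.DataFrame({'A': [0, 1, 2, 0, 3, 0, 0, 4, 1, 0, 2, 2, 0, 1, 3],
                   'B': [1.0] * 15},
                  index=idx)

# Keep the condition as a boolean helper column,
# then aggregate each column in its own way per 5-day bin
out = (df.assign(A_pos=df['A'] > 0)
         .resample('5D')
         .agg({'A_pos': 'sum', 'B': 'mean'}))

# Make sure the conditional count is stored as an integer
out['A_pos'] = out['A_pos'].astype(int)

print(out)
```

With these values the three bins contain 3, 2 and 4 rows where A is larger than 0, and the mean of B is 1.0 in each bin.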

Tags:

python-3.x

pandas

Jeroen
