Pandas: Count Unique Values after Resample

Question

I'm just getting started with Pandas and am trying to combine two steps: grouping my data by date, and counting the unique values in each group.

Here's what my data looks like:

                  User, Type
Datetime
2014-04-15 11:00:00, A, New
2014-04-15 12:00:00, B, Returning
2014-04-15 13:00:00, C, New
2014-04-20 14:00:00, D, New
2014-04-20 15:00:00, B, Returning
2014-04-20 16:00:00, B, Returning
2014-04-20 17:00:00, D, Returning

And here's what I would like to get to: resample the datetime index to daily frequency (which I can do), and also count the unique users for each day. I'm not interested in the 'Type' column yet.

Day, Unique Users
2014-04-15, 3
2014-04-20, 2

I'm trying df.user.resample('D', how='count').unique but it doesn't seem to give me the right answer.

Karl D. · Accepted Answer

You don't need to do a resample to get the desired output in your question. I think you can get by with just a groupby on date:

print(df.groupby(df.index.date)['User'].nunique())

2014-04-15    3
2014-04-20    2
dtype: int64
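For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame construction is an assumption about how the data is loaded (the question only shows the printed table), and it assumes Python 3 with a reasonably recent pandas:

```python
import pandas as pd

# Sample data copied from the question; building it by hand is an assumption,
# the original post does not show how df was created.
df = pd.DataFrame(
    {
        "User": ["A", "B", "C", "D", "B", "B", "D"],
        "Type": ["New", "Returning", "New", "New",
                 "Returning", "Returning", "Returning"],
    },
    index=pd.to_datetime([
        "2014-04-15 11:00:00",
        "2014-04-15 12:00:00",
        "2014-04-15 13:00:00",
        "2014-04-20 14:00:00",
        "2014-04-20 15:00:00",
        "2014-04-20 16:00:00",
        "2014-04-20 17:00:00",
    ]),
)
df.index.name = "Datetime"

# Group rows by calendar date (time of day dropped) and count distinct users.
daily_unique = df.groupby(df.index.date)["User"].nunique()
print(daily_unique)
```

Note that the resulting index holds plain datetime.date objects rather than a DatetimeIndex, which is why it has to be converted before any later resampling.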

If you want, you can then resample to fill in the gaps in the time series after counting the unique users:

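import pandas as pd

cnt = df.groupby(df.index.date)['User'].nunique()
# Convert the date objects to a DatetimeIndex so the series can be resampled
cnt.index = pd.to_datetime(cnt.index)
# asfreq() inserts the missing days as NaN without aggregating anything
print(cnt.resample('D').asfreq())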

2014-04-15     3
2014-04-16   NaN
2014-04-17   NaN
2014-04-18   NaN
2014-04-19   NaN
2014-04-20     2
Freq: D, dtype: float64
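If zeros are preferable to NaN for the days with no activity, one possible variation is to pass a fill value to asfreq. This is a small sketch on a hand-built series that mirrors the counts above; the variable names are illustrative only:

```python
import pandas as pd

# Daily unique-user counts for the two days present in the sample data.
counts = pd.Series(
    [3, 2],
    index=pd.to_datetime(["2014-04-15", "2014-04-20"]),
)

# Missing days become NaN, so the result is upcast to float.
filled = counts.resample("D").asfreq()

# Missing days become 0, so the integer dtype is preserved.
zero_filled = counts.resample("D").asfreq(fill_value=0)

print(filled)
print(zero_filled)
```

Whether NaN or 0 is the right choice depends on whether "no rows" should mean "unknown" or "no users" in your analysis.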

mjspier · Answer

I came across the same problem. Resample worked for me with nunique. The advantage of resample is that it makes it very simple to change the sampling rate, for example to hours or minutes, and that it keeps the timestamp as the index.

df['User'].resample('D').nunique()
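A small runnable sketch of this approach, using the sample data from the question (the DataFrame construction is assumed, and a recent pandas is assumed as well). Unlike the groupby version, the daily result already includes the days without any rows, with a count of 0:

```python
import pandas as pd

# Sample data from the question; only the 'User' column is needed here.
df = pd.DataFrame(
    {"User": ["A", "B", "C", "D", "B", "B", "D"]},
    index=pd.to_datetime([
        "2014-04-15 11:00:00",
        "2014-04-15 12:00:00",
        "2014-04-15 13:00:00",
        "2014-04-20 14:00:00",
        "2014-04-20 15:00:00",
        "2014-04-20 16:00:00",
        "2014-04-20 17:00:00",
    ]),
)

# Daily bins: one row per calendar day between the first and last timestamp.
daily = df["User"].resample("D").nunique()
print(daily)

# Changing the sample rate only means changing the frequency alias,
# for example weekly bins instead of daily ones.
weekly = df["User"].resample("W").nunique()
print(weekly)
```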

Tags: python, pandas

Malcolm Bastien
