How to group by hour in pandas and check data for every hour?

Question

I have a dataframe df

hour  calls  received appointment
6:48  4        2         2
4:02  21       3         2
12:52 31       7         4
2:14  32       5         2
6:45  13       3         2

The hour column is string

I want to calculate the sum and group by the hour in the format like 1-2,2-3.

My approach would be:

df[['hour','calls','received','appointment']].groupby('hour').sum()

Also, I would like to check for every hour, if data is not present for any hour, fill it by zero.

I want the output as :

hour calls received appointment
0-1   0     0         0
1-2   0     0         0
2-3   32    5         2
3-4   0     0         0
4-5   21    3         2
5-6   0     0         0
6-7   17    5         4
...

I'mahdi · Accepted Answer

You can use pandas.resmaple base hour then compute sum on ['calls','received','appointment'] and at the end rename datetime to desired format.

df['time'] = pd.to_datetime(df['hour'])
df = df.set_index('time').resample('H')[['calls','received','appointment']].sum().reset_index()

# rename 2022-07-24 02:00:00 -> (2-3)
df['time'] = df['time'].apply(lambda x: f"{x.hour}-{x.hour+1}")
print(df)

     time  calls  received  appointment
0     2-3     32         5            2
1     3-4      0         0            0
2     4-5     21         3            2
3     5-6      0         0            0
4     6-7     17         5            4
5     7-8      0         0            0
6     8-9      0         0            0
7    9-10      0         0            0
8   10-11      0         0            0
9   11-12      0         0            0
10  12-13     31         7            4

Corralien · Answer

You can use pd.cut on hour column:

# Create labels: 0-1, 1-2, 2-3, ...
labels = [f"{i}-{i+1}" for i in range(24)]

# Extract the hour part and convert it as int
hours = df['hour'].str.split(':').str[0].astype(int)

# Classify your data. The output is a Series with a 'category' dtype
df['hour'] = pd.cut(hours, range(25), labels=labels, right=False)

# Group by range and sum [...]
out = df.groupby('hour', as_index=False).sum()

[...] and as the grouper is Categorical, all observations are displayed :)

observed : bool, default False
This only applies if any of the groupers are Categoricals.
If True: only show observed values for categorical groupers.
If False: show all values for categorical groupers.

Output:

>>> out
     hour  calls  received  appointment
0     0-1      0         0            0
1     1-2      0         0            0
2     2-3     32         5            2
3     3-4      0         0            0
4     4-5     21         3            2
5     5-6      0         0            0
6     6-7     17         5            4
7     7-8      0         0            0
8     8-9      0         0            0
9    9-10      0         0            0
10  10-11      0         0            0
11  11-12      0         0            0
12  12-13     31         7            4
13  13-14      0         0            0
14  14-15      0         0            0
15  15-16      0         0            0
16  16-17      0         0            0
17  17-18      0         0            0
18  18-19      0         0            0
19  19-20      0         0            0
20  20-21      0         0            0
21  21-22      0         0            0
22  22-23      0         0            0
23  23-24      0         0            0

How to group by hour in pandas and check data for every hour?

Tags:

python

date

pandas

dataframe

numpy

Atom Store

2 Answers

I'mahdi

Corralien

Recent Activity

Donate For Us

How to group by hour in pandas and check data for every hour?

Tags:

python

date

pandas

dataframe

numpy

Atom Store

2 Answers

I'mahdi

Corralien

Related questions

Recent Activity

Donate For Us