pandas time-series data preprocessing

Question

I have a dataframe that looks like this:

> dt
    text    timestamp
0   a   2016-06-13 18:00
1   b   2016-06-20 14:08
2   c   2016-07-01 07:41
3   d   2016-07-11 19:07
4   e   2016-08-01 16:00

And I want to summarise every month's data like:

> dt_month
    count   timestamp
0   2   2016-06
1   2   2016-07
2   1   2016-08

The original dataset (dt) can be generated by:

import pandas as pd
data = {'text': ['a', 'b', 'c', 'd', 'e'],
    'timestamp': ['2016-06-13 18:00', '2016-06-20 14:08', '2016-07-01 07:41', '2016-07-11 19:07', '2016-08-01 16:00']}
dt = pd.DataFrame(data)

Is there also a way to draw a time-frequency plot from dt_month?

jezrael · Accepted Answer

You can group by the timestamp column, converted to monthly periods with to_period, and aggregate with size:

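# the sample timestamps are strings, so convert them to datetimes first
dt['timestamp'] = pd.to_datetime(dt['timestamp'])
# group the rows by calendar month and count the rows in each group
print(dt.text.groupby(dt.timestamp.dt.to_period('M'))
             .size()
             .rename('count')
             .reset_index())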

  timestamp  count
0   2016-06      2
1   2016-07      2
2   2016-08      1
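
For the plotting part of the question, here is a minimal sketch rather than a definitive recipe. It assumes the timestamps start out as strings (as in the sample data) and that matplotlib is installed, since pandas uses it for plotting. The helper names `monthly_counts` and `plot_monthly_counts` are made up for illustration. It also fills months that have no rows with 0, which the groupby alone would leave out of the result.

```python
import pandas as pd

data = {'text': ['a', 'b', 'c', 'd', 'e'],
        'timestamp': ['2016-06-13 18:00', '2016-06-20 14:08', '2016-07-01 07:41',
                      '2016-07-11 19:07', '2016-08-01 16:00']}
dt = pd.DataFrame(data)


def monthly_counts(frame, column='timestamp'):
    """Count rows per calendar month, keeping empty months as 0 (hypothetical helper)."""
    # Assumption: the column may hold strings, so parse it to datetimes first.
    stamps = pd.to_datetime(frame[column])
    counts = stamps.groupby(stamps.dt.to_period('M')).size()
    # Reindex over every month between the first and last one so gaps become 0.
    full_range = pd.period_range(counts.index.min(), counts.index.max(), freq='M')
    return (counts.reindex(full_range, fill_value=0)
                  .rename('count')
                  .rename_axis(column))


def plot_monthly_counts(counts):
    """Draw the monthly counts as a bar chart and return the Axes (hypothetical helper)."""
    # Series.plot requires matplotlib; pandas imports it on demand.
    ax = counts.plot(kind='bar', rot=0)
    ax.set_xlabel('month')
    ax.set_ylabel('count')
    return ax


counts = monthly_counts(dt)
print(counts)
```

Calling `plot_monthly_counts(counts)` and then `matplotlib.pyplot.show()` (or `ax.figure.savefig(...)` on the returned Axes) should give one bar per month. A bar chart is just one reasonable choice here; `kind='line'` works the same way if you prefer a line plot.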

Tags: python, pandas, dataframe, visualization, time-series

seanDot7
