Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas Find Date Frequency

Tags:

python

pandas

I have a dataset:

    login                 id
0   2015-06-22 04:55:00   1
1   2015-06-23 05:55:00   1
2   2015-06-25 04:55:00   2
3   2015-06-26 02:55:00   2
4   2015-07-02 04:55:00   2
5   2015-07-12 04:55:00   3
6   2015-07-13 04:55:00   3
7   2015-07-15 04:55:00   5
8   2015-07-21 04:55:00   5
9   2015-07-22 04:55:00   5
10  2015-07-30 04:55:00   5
11  2015-08-30 04:55:00   5
12  2015-06-02 04:55:00   7
13  2015-07-02 04:55:00   7
14  2015-08-02 04:55:00   7

I am using Pandas in Python for the analysis. I would like to check if a particular id has logged in at least 2 times in a 24 hour period, and store those ids in a list called good_id.

For example: id = 1 would not be stored because they logged in twice but in a 25 hour period. id = 2 would be stored because they have.

etc.

like image 539
sci-guy Avatar asked May 25 '26 21:05

sci-guy


1 Answers

Here's a multiple-step approach:

df['last_log'] = df.groupby('id').login.shift().fillna(pd.to_datetime(0))
df['duration'] = df.login - df.last_log

# good ids
df.id[(df['duration'] <= pd.Timedelta(1, 'd'))].unique()

# output: array([2, 3, 5], dtype=int64)
like image 151
Quang Hoang Avatar answered May 27 '26 10:05

Quang Hoang



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!