Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to group by date range

Tags:

python

pandas

I am trying to group data from one column by data in another column, but I only want data from a specific time range. So lets say 2015-11-1 to 2016-4-30. My database looks something like this:

account_id    employer_key    login_date
1111111       google          2016-03-03 20:58:36.000000
2222222       walmart         2015-11-18 11:52:56.000000
2222222       walmart         2015-11-18 11:53:14.000000
1111111       google          2016-04-06 23:29:04.000000
3333333       dell_inc        2015-09-05 14:13:53.000000
3333333       dell_inc        2016-01-28 03:20:58.000000
2222222       walmart         2015-09-03 00:11:38.000000
1111111       google          2015-09-03 00:12:25.000000
1111111       google          2015-11-13 01:59:59.000000
4444444       google          2015-11-13 01:59:59.000000
5555555       dell_inc        2015-03-12 01:59:59.000000

I am trying to get an output that looks something like this (where it shows only a 1 or true if the person logged in during that time window and a 0 or false if they didn't):

employer_key  account_id   login_date
google        1111111       1
              4444444       1
walmart       2222222       1
dell_inc      3333333       1
dell_inc      5555555       0

How can I go about doing this?

like image 655
rainbow sherbet Avatar asked Dec 19 '25 22:12

rainbow sherbet


1 Answers

You can do it this way:

In [252]: df.groupby(['employer_key','account_id']) \
     ...:   .apply(lambda x: len(x.query("'2015-11-01' <= login_date <= '2016-04-30'")) > 0) \
     ...:   .reset_index()
Out[252]:
  employer_key  account_id      0
0     dell_inc     3333333   True
1     dell_inc     5555555  False
2       google     1111111   True
3       google     4444444   True
4      walmart     2222222   True

or using boolean indexing:

In [249]: df.groupby(['employer_key','account_id'])['login_date'] \
     ...:   .apply(lambda x: len(x[x.ge('2015-11-01') & x.le('2016-04-30')]) > 0)
Out[249]:
employer_key  account_id
dell_inc      3333333        True
              5555555       False
google        1111111        True
              4444444        True
walmart       2222222        True
Name: login_date, dtype: bool

or additionally using reset_index():

In [250]: df.groupby(['employer_key','account_id'])['login_date'] \
     ...:   .apply(lambda x: len(x[x.ge('2015-11-01') & x.le('2016-04-30')]) > 0) \
     ...:   .reset_index()
Out[250]:
  employer_key  account_id login_date
0     dell_inc     3333333       True
1     dell_inc     5555555      False
2       google     1111111       True
3       google     4444444       True
4      walmart     2222222       True
like image 118
MaxU - stop WAR against UA Avatar answered Dec 22 '25 12:12

MaxU - stop WAR against UA



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!