Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

filter pandas dataframe by time

I have a pandas dataframe which I want to subset on time greater or less than 12pm. First i convert my string datetime to datetime[64]ns object in pandas.

segments_data['time'] = pd.to_datetime((segments_data['time']))

Then I separate time,date,month,year & dayofweek like below.

import datetime as dt

segments_data['date'] = segments_data.time.dt.date
segments_data['year'] = segments_data.time.dt.year
segments_data['month'] = segments_data.time.dt.month
segments_data['dayofweek'] = segments_data.time.dt.dayofweek
segments_data['time'] = segments_data.time.dt.time

My time column looks like following.

segments_data['time']
Out[1906]: 
  07:43:00
  07:52:00
  08:00:00
  08:42:00
  09:18:00
  09:18:00
  09:18:00
  09:23:00
  12:32:00
  12:43:00
  12:55:00
  Name: time, dtype: object

Now I want to subset dataframe with time greater than 12pm and time less than 12pm.

segments_data.time[segments_data['time'] < 12:00:00]

It doesn't work because time is a string object.

like image 391
Neil Avatar asked Jan 28 '16 04:01

Neil


1 Answers

Even though this post is 5 years old I just ran into this same problem and decided to post what I was able to get to work. I tried the between_time function but that did not work for me because the index on the dataframe had to be a datetime and I wanted to use one of the dataframe time columns to filter.

        # Import datetime libraries
    from datetime import datetime, date, time
        
    avail_df['Start'].dt.time
            1   08:36:44
            2   08:49:14
            3   09:26:00
            5   08:34:22
            7   08:34:19
            8   09:09:05
            9   12:27:43
           10   12:29:14
           12   09:05:55
           13   09:14:11
           14   09:21:41
           15   11:28:26
           16   12:25:10
           17   16:02:52
           18   08:53:51


# Use "time()" function to create start/end parameter I used 9:00am for this example 
avail_df.loc[avail_df['Start'].dt.time > time(9,00)]

3   09:26:00
8   09:09:05
9   12:27:43
10  12:29:14
12  09:05:55
13  09:14:11
14  09:21:41
15  11:28:26
16  12:25:10
17  16:02:52
20  09:04:50
21  09:21:35
22  09:22:05
23  09:47:05
24  09:55:05
like image 140
Mike Avatar answered Sep 30 '22 14:09

Mike