Slice a Pandas dataframe with DatetimeIndex based on time interval

Tags: datetime, slice, pandas, dataframe

I'm trying to accomplish the following...

I have a Pandas dataframe with a large number of entries, indexed with a DatetimeIndex, which looks a bit like this:

bro_df.info()

<class 'bat.log_to_dataframe.LogToDataFrame'>
DatetimeIndex: 3596641 entries, 2017-12-14 13:52:01.633070 to 2018-01-03 09:59:53.108566
Data columns (total 20 columns):
conn_state        object
duration          timedelta64[ns]
history           object
id.orig_h         object
id.orig_p         int64
id.resp_h         object
id.resp_p         int64
local_orig        bool
local_resp        bool
missed_bytes      int64
orig_bytes        int64
orig_ip_bytes     int64
orig_pkts         int64
proto             object
resp_bytes        int64
resp_ip_bytes     int64
resp_pkts         int64
service           object
tunnel_parents    object
uid               object
dtypes: bool(2), int64(9), object(8), timedelta64[ns](1)
memory usage: 528.2+ MB

What I'm interested in is getting a slice of this data that takes the last entry, '2018-01-03 09:59:53.108566' in this case, and then subtracts an hour from that. This should give me the last hour's worth of entries.

What I've tried to do so far is the following:

last_entry = bro_df.index[-1:]
first_entry = last_entry - pd.Timedelta('1 hour')

This gives me what look like correct values:

print(first_entry)
print(last_entry)

DatetimeIndex(['2018-01-03 08:59:53.108566'], dtype='datetime64[ns]', name='ts', freq=None)
DatetimeIndex(['2018-01-03 09:59:53.108566'], dtype='datetime64[ns]', name='ts', freq=None)

This is also, sadly, where I get stuck. I've tried various things with bro_df.loc, bro_df.iloc and so on, but all I get are different errors about datatypes, values not being in the index, etc. This leads me to think that I might need to convert the first_entry and last_entry variables to another type?

Or I might as usual be barking up entirely the wrong tree.

Any assistance or guidance would be most appreciated.

Cheers, Mike

asked Jan 05 '18 09:01

Swedish Mike

1 Answer

It seems you need to create scalars by indexing with [0], then select with loc:

df = bro_df.loc[first_entry[0]: last_entry[0]]

Or slice with [] directly, which for a DatetimeIndex also selects rows by label:

df = bro_df[first_entry[0]: last_entry[0]]
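
To illustrate why the [0] is needed, here is a minimal sketch on a small stand-in frame (demo_df and its values are made up for illustration, not taken from the question's data). It shows that index[-1:] yields a one-element DatetimeIndex while index[-1] yields a scalar Timestamp, so starting from index[-1] should avoid the extra [0] altogether, assuming the index is sorted:

```python
import pandas as pd

# Small stand-in frame; the real bro_df is assumed to have a sorted DatetimeIndex.
idx = pd.date_range('2018-01-03 08:00', periods=5, freq='30min')
demo_df = pd.DataFrame({'a': range(5)}, index=idx)

as_index = demo_df.index[-1:]   # slice -> one-element DatetimeIndex
as_scalar = demo_df.index[-1]   # integer position -> scalar pd.Timestamp

print(type(as_index).__name__)
print(type(as_scalar).__name__)

# With a scalar end point, the start of the window is a scalar too,
# so no [0] is needed when slicing. Label slices include both end points.
start = as_scalar - pd.Timedelta(hours=1)
last_hour = demo_df.loc[start:as_scalar]
print(last_hour)
```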

Sample:

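import pandas as pd

# '144min' is the same interval as '2H 24T'; the 'H' and 'T' aliases are deprecated in recent pandas
rng = pd.date_range('2017-04-03', periods=10, freq='144min')
bro_df = pd.DataFrame({'a': range(10)}, index=rng)
print (bro_df)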
                     a
2017-04-03 00:00:00  0
2017-04-03 02:24:00  1
2017-04-03 04:48:00  2
2017-04-03 07:12:00  3
2017-04-03 09:36:00  4
2017-04-03 12:00:00  5
2017-04-03 14:24:00  6
2017-04-03 16:48:00  7
2017-04-03 19:12:00  8
2017-04-03 21:36:00  9

last_entry = bro_df.index[-1:]
first_entry = last_entry - pd.Timedelta('3 hour')
print (last_entry)
DatetimeIndex(['2017-04-03 21:36:00'], dtype='datetime64[ns]', freq='144T')

print (first_entry)
DatetimeIndex(['2017-04-03 18:36:00'], dtype='datetime64[ns]', freq=None)

print (last_entry[0])
2017-04-03 21:36:00

print (first_entry[0])
2017-04-03 18:36:00

df = bro_df.loc[first_entry[0]: last_entry[0]]
print (df)
                     a
2017-04-03 19:12:00  8
2017-04-03 21:36:00  9

df1 = bro_df[first_entry[0]: last_entry[0]]
print (df1)
                     a
2017-04-03 19:12:00  8
2017-04-03 21:36:00  9
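
If the index might not be sorted, a boolean mask is one possible alternative to label slicing. The following is only a sketch under that assumption: last_window is a hypothetical helper name (not a pandas function), and sample_df rebuilds the sample data above.

```python
import pandas as pd

def last_window(df, window):
    """Return the rows whose timestamps fall within `window` of the latest one.

    `last_window` is a hypothetical helper name, not part of pandas.
    """
    end = df.index.max()                 # latest timestamp, sorted or not
    start = end - pd.Timedelta(window)
    return df[df.index >= start]         # boolean mask; start is inclusive

# Same sample data as above: 10 rows, 2 hours 24 minutes apart.
rng = pd.date_range('2017-04-03', periods=10, freq='144min')
sample_df = pd.DataFrame({'a': range(10)}, index=rng)

result = last_window(sample_df, '3 hours')
print(result)
```

On the sample data this should select the same two rows as the loc slice, since only 19:12 and 21:36 fall within three hours of the last entry.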

answered Nov 30 '22 17:11

jezrael
