I am struggling to round off timestamps using pandas.
The timestamps look like this:
datetime.datetime(2017,06,25,00,31,53,993000)
datetime.datetime(2017,06,25,00,32,31,224000)
datetime.datetime(2017,06,25,00,33,11,223000)
datetime.datetime(2017,06,25,00,33,53,876000)
datetime.datetime(2017,06,25,00,34,31,219000)
datetime.datetime(2017,06,25,00,35,12,634000)
How do I round off to the nearest second?
Previously iv tried suggestions in this post but they didn't work: Rounding time off to the nearest second - Python
My code looks like this so far:
import pandas as pd
filename = 'data.csv'
readcsv = pd.read_csv(filename)
Import data according to file header info
log_date = readcsv.date
log_time = readcsv.time
log_lon = readcsv.lon
log_lat = readcsv.lat
log_heading = readcsv.heading
readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time
Combine date and time into one variable
timestamp = [datetime.datetime.combine(log_date[i],log_time[i]) for i in range(len(log_date))]
Create dataframe
data = {'timestamp':timestamp,'log_lon':log_lon,'log_lat':log_lat,'log_heading':log_heading}
log_data = pd.DataFrame(data,columns=['timestamp','log_lon','log_lat','log_heading'])
log_data.index = log_data['timestamp']
I am still quite new to python so please excuse my ignorance
To round the Timedelta with specified resolution, use the timestamp. round() method. Set the seconds frequency resolution using the freq parameter with value 's'.
Comparison between pandas timestamp objects is carried out using simple comparison operators: >, <,==,< = , >=. The difference can be calculated using a simple '–' operator. Given time can be converted to pandas timestamp using pandas. Timestamp() method.
You can use first read_csv
with parameter parse_dates
for create datetime
s from column date
and time
and then dt.round
for round datetime
s:
import pandas as pd
temp=u"""date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:11.223000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a
2017-06-25,00:34:31.219000,48.1254,17.1458,a
2017-06-25,00:35:12.634000,48.1254,17.1458,a"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), parse_dates={'timestamp':['date','time']})
print (df)
timestamp lon lat heading
0 2017-06-25 00:31:53.993 48.1254 17.1458 a
1 2017-06-25 00:32:31.224 48.1254 17.1458 a
2 2017-06-25 00:33:11.223 48.1254 17.1458 a
3 2017-06-25 00:33:53.876 48.1254 17.1458 a
4 2017-06-25 00:34:31.219 48.1254 17.1458 a
5 2017-06-25 00:35:12.634 48.1254 17.1458 a
print (df.dtypes)
timestamp datetime64[ns]
lon float64
lat float64
heading object
dtype: object
df['timestamp'] = df['timestamp'].dt.round('1s')
print (df)
timestamp lon lat heading
0 2017-06-25 00:31:54 48.1254 17.1458 a
1 2017-06-25 00:32:31 48.1254 17.1458 a
2 2017-06-25 00:33:11 48.1254 17.1458 a
3 2017-06-25 00:33:54 48.1254 17.1458 a
4 2017-06-25 00:34:31 48.1254 17.1458 a
5 2017-06-25 00:35:13 48.1254 17.1458 a
EDIT:
If you want set column with datetimes to index
also:
import pandas as pd
temp=u"""date,time,lon,lat,heading
2017-06-25,00:31:53.993000,48.1254,17.1458,a
2017-06-25,00:32:31.224000,48.1254,17.1458,a
2017-06-25,00:33:11.223000,48.1254,17.1458,a
2017-06-25,00:33:53.876000,48.1254,17.1458,a
2017-06-25,00:34:31.219000,48.1254,17.1458,a
2017-06-25,00:35:12.634000,48.1254,17.1458,a"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), parse_dates={'timestamp':['date','time']}, index_col=['timestamp'])
print (df)
lon lat heading
timestamp
2017-06-25 00:31:53.993 48.1254 17.1458 a
2017-06-25 00:32:31.224 48.1254 17.1458 a
2017-06-25 00:33:11.223 48.1254 17.1458 a
2017-06-25 00:33:53.876 48.1254 17.1458 a
2017-06-25 00:34:31.219 48.1254 17.1458 a
2017-06-25 00:35:12.634 48.1254 17.1458 a
print (df.index)
DatetimeIndex(['2017-06-25 00:31:53.993000', '2017-06-25 00:32:31.224000',
'2017-06-25 00:33:11.223000', '2017-06-25 00:33:53.876000',
'2017-06-25 00:34:31.219000', '2017-06-25 00:35:12.634000'],
dtype='datetime64[ns]', name='timestamp', freq=None)
df.index = df.index.round('1s')
print (df)
lon lat heading
timestamp
2017-06-25 00:31:54 48.1254 17.1458 a
2017-06-25 00:32:31 48.1254 17.1458 a
2017-06-25 00:33:11 48.1254 17.1458 a
2017-06-25 00:33:54 48.1254 17.1458 a
2017-06-25 00:34:31 48.1254 17.1458 a
2017-06-25 00:35:13 48.1254 17.1458 a
dt.round is what you are looking for. I'll just create a smaller version of your DataFrame, please comment if you can't modify it to fully fit your case, I can help with that also.
import datetime
import pandas as pd
ts1 = datetime.datetime(2017,06,25,00,31,53,993000)
ts2 = datetime.datetime(2017,06,25,00,32,31,224000)
ts3 = datetime.datetime(2017,06,25,00,33,11,223000)
df = pd.DataFrame({'timestamp':[ts1, ts2, ts3]})
df.timestamp.dt.round('1s')
Gives you the following:
Out[89]:
0 2017-06-25 00:31:54
1 2017-06-25 00:32:31
2 2017-06-25 00:33:11
Name: timestamp, dtype: datetime64[ns]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With