
Adding a fixed date to pandas dataframe

Tags: python, pandas

I am reading some data and creating a DataFrame with from_records. The data contains a text timestamp in the form HH:MM:SS:000000, which I can convert to datetimes with pd.to_datetime(data.timestamp, format='%H:%M:%S:%f'). I know the date of the file from the filename. What is a Pythonic and performant way to insert the date (and eventually set the result as the index)?

Data looks like:

12:00:00:000000 100
12:00:01:123456 200
12:00:02:000000 300

Without the date inserted I get a dataframe that looks like:

1900-01-01 12:00:00.000000 100
1900-01-01 12:00:01.123456 200
1900-01-01 12:00:02.000000 300

And what I'd want is (given date = datetime.date(2017, 6, 28)):

2017-06-28 12:00:00.000000 100
2017-06-28 12:00:01.123456 200
2017-06-28 12:00:02.000000 300

The origin argument of pd.to_datetime sounded like what I want, but it requires numeric input rather than strings.

Kyle asked Apr 25 '26 13:04


1 Answer

You can format the date as a string with strftime, prepend it to the time column, and parse the combined string:

df['datetime'] = pd.to_datetime(date.strftime('%Y-%m-%d ') + df['time'],
                                format='%Y-%m-%d %H:%M:%S:%f')

print (df)
              time    A                   datetime
0  12:00:00:000000  100 2017-06-28 12:00:00.000000
1  12:00:01:123456  200 2017-06-28 12:00:01.123456
2  12:00:02:000000  300 2017-06-28 12:00:02.000000

To set the result as the index instead:

df.index = pd.to_datetime(date.strftime('%Y-%m-%d ') + df['time'],
                          format='%Y-%m-%d %H:%M:%S:%f')

print (df)
                                       time    A
time                                            
2017-06-28 12:00:00.000000  12:00:00:000000  100
2017-06-28 12:00:01.123456  12:00:01:123456  200
2017-06-28 12:00:02.000000  12:00:02:000000  300
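For anyone who wants to run this end to end, here is a minimal self-contained sketch of the string-prefix approach. It assumes the column names time and A from the output above and that the date has already been extracted from the filename; the names records, stamps and indexed are only illustrative.

```python
import datetime

import pandas as pd

# Sample records mirroring the question's data; the column names
# 'time' and 'A' are taken from the printed output in the answer.
records = [
    ("12:00:00:000000", 100),
    ("12:00:01:123456", 200),
    ("12:00:02:000000", 300),
]
df = pd.DataFrame.from_records(records, columns=["time", "A"])

# The file's date, assumed to be already parsed from the filename.
date = datetime.date(2017, 6, 28)

# Prepend the date to every time string, then parse with one explicit format.
stamps = pd.to_datetime(
    date.strftime("%Y-%m-%d ") + df["time"],
    format="%Y-%m-%d %H:%M:%S:%f",
)

# Use the parsed timestamps as the index and drop the raw text column.
indexed = df.drop(columns="time")
indexed.index = pd.DatetimeIndex(stamps, name="datetime")
print(indexed)
```

Dropping the raw time column is optional; it is done here only so the index is the single source of the timestamp.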

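Another solution is to parse the times on their own, which places them on the default date 1900-01-01, and then add the offset between that default and the target date:

import datetime

date = datetime.date(2017, 6, 28)
# offset from the default parse date (1900-01-01) to the target date
days = date - datetime.date(1900, 1, 1)

# parentheses let the expression span two lines
df['datetime'] = (pd.to_datetime(df['time'], format='%H:%M:%S:%f') +
                  pd.to_timedelta(days))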

print (df)
              time    A                   datetime
0  12:00:00:000000  100 2017-06-28 12:00:00.000000
1  12:00:01:123456  200 2017-06-28 12:00:01.123456
2  12:00:02:000000  300 2017-06-28 12:00:02.000000
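The offset approach can also be wrapped in a small function so it can be reused per file. This is only a sketch: attach_date is a hypothetical helper, not a pandas function, and it relies on to_datetime placing date-less times on 1900-01-01, as shown in the question.

```python
import datetime

import pandas as pd


def attach_date(times: pd.Series, date: datetime.date) -> pd.Series:
    """Parse 'HH:MM:SS:ffffff' strings and shift them onto `date`.

    Hypothetical helper for illustration; not part of pandas.
    """
    # With no date in the format, the parsed values land on 1900-01-01.
    parsed = pd.to_datetime(times, format="%H:%M:%S:%f")
    # Offset between that default date and the wanted date.
    offset = pd.Timestamp(date) - pd.Timestamp(1900, 1, 1)
    return parsed + offset


times = pd.Series(["12:00:00:000000", "12:00:01:123456", "12:00:02:000000"])
result = attach_date(times, datetime.date(2017, 6, 28))
print(result)
```

Compared with the strftime version, this adds one scalar offset to the parsed column instead of building a new string per row. Whether that is faster on your data is worth timing rather than assuming.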
jezrael answered Apr 28 '26 02:04


