For a given timestamp such as 2018-06-01 06:36:40.047883+00:00, I want to remove the microseconds and strip everything from the '+' onward. Most of my dataset contains values like 2018-06-04 11:30:00+00:00, without the microsecond part.
How can I get a common datetime format for all values?
Let's say you have a mix of different formats that looks like this:
import pandas as pd
df = pd.DataFrame()
df['time'] = ['2018-06-01 06:36:40.047883+00:00', '2018-06-01 06:36:40.047883+00:00', '2018-06-04 11:30:00+00:00', '2018-06-01 06:36:40.047883']
Corresponding output:
time
0 2018-06-01 06:36:40.047883+00:00
1 2018-06-01 06:36:40.047883+00:00
2 2018-06-04 11:30:00+00:00
3 2018-06-01 06:36:40.047883
You want a common format with the microseconds and everything from the + onward removed. In short, you want every value in YYYY-MM-DD HH:MM:SS form.
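I'll assume your column currently holds strings. Convert it to datetime, drop the UTC offset, and then set the microseconds to 0 so they no longer appear.
# utc=True copes with the mix of offset-aware and naive strings (naive ones are treated as UTC).
# format='ISO8601' needs pandas 2.0+, where rows of differing precision otherwise fail to parse; omit it on older versions.
df['time'] = pd.to_datetime(df['time'], utc=True, format='ISO8601')
df['time'] = df['time'].dt.tz_localize(None).apply(lambda x: x.replace(microsecond=0))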
Output:
time
0 2018-06-01 06:36:40
1 2018-06-01 06:36:40
2 2018-06-04 11:30:00
3 2018-06-01 06:36:40
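If you would rather not depend on how a particular pandas version parses mixed formats, the same idea can be sketched with the standard library alone. This is only an illustration: the helper name to_common_format is made up here, and it assumes every value is an ISO-style string like the four samples above.

```python
from datetime import datetime

def to_common_format(value: str) -> str:
    """Return value as 'YYYY-MM-DD HH:MM:SS' (hypothetical helper, not from the original answer)."""
    # fromisoformat accepts the sample strings with or without a fraction or offset
    parsed = datetime.fromisoformat(value)
    # Zero the microseconds and discard the offset. No timezone conversion happens,
    # which is fine for +00:00 but keeps the local wall time for other offsets.
    trimmed = parsed.replace(microsecond=0, tzinfo=None)
    return trimmed.strftime('%Y-%m-%d %H:%M:%S')

samples = [
    '2018-06-01 06:36:40.047883+00:00',
    '2018-06-01 06:36:40.047883+00:00',
    '2018-06-04 11:30:00+00:00',
    '2018-06-01 06:36:40.047883',
]
common = [to_common_format(s) for s in samples]
print(common)
```

On a DataFrame, something like df['time'].map(to_common_format) should apply it to the whole column, leaving the values as strings.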
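Another way, if you are happy to keep the values as strings, is str.split. Note that splitting on '+' only drops the offset; any microseconds before it stay in place: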
t = "2018-06-04 11:30:00+00:00"
t.split('+')[0]
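To remove the microseconds with the same string approach, you could split a second time on the '.'. The sketch below assumes offsets always start with '+' (a negative offset such as -05:00 would need different handling, since '-' also appears in the date), and the function name is just a placeholder.

```python
def strip_fraction_and_offset(value: str) -> str:
    """Keep only the 'YYYY-MM-DD HH:MM:SS' part of a timestamp string (hypothetical helper)."""
    without_offset = value.split('+')[0]   # drop '+00:00' if present
    return without_offset.split('.')[0]    # drop '.047883' if present

values = [
    '2018-06-01 06:36:40.047883+00:00',
    '2018-06-04 11:30:00+00:00',
    '2018-06-01 06:36:40.047883',
]
cleaned = [strip_fraction_and_offset(v) for v in values]
print(cleaned)
```

This never parses the values, so it will not catch malformed dates; it only trims text.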