I have an observational data set which contain weather information. Each column contain specific field in which date and time are in two separate column. The time column contain hourly time like 0000, 0600 .. up to 2300. What I am trying to do is to filter the data set based on certain time frame, for example between 0000 UTC to 0600 UTC. When I try to read the data file in pandas data frame, by default the time column is read in float. When I try to convert it in to datatime object, it produces a format which I am unable to convert. Code example is given below:
import pandas as pd
import datetime as dt
df = pd.read_excel("test.xlsx")
df.head()
which produces the following result:
tdate itime moonph speed ... qnh windir maxtemp mintemp
0 01-Jan-17 1000.0 NM7 5 ... $1,011.60 60.0 $32.60 $22.80
1 01-Jan-17 1000.0 NM7 2 ... $1,015.40 999.0 $32.60 $22.80
2 01-Jan-17 1030.0 NM7 4 ... $1,015.10 60.0 $32.60 $22.80
3 01-Jan-17 1100.0 NM7 3 ... $1,014.80 999.0 $32.60 $22.80
4 01-Jan-17 1130.0 NM7 5 ... $1,014.60 270.0 $32.60 $22.80
Then I extracted the time column with following line:
df["time"] = df.itime
df["time"]
0 1000.0
1 1000.0
2 1030.0
3 1100.0
4 1130.0
5 1200.0
6 1230.0
7 1300.0
8 1330.0
.
.
3261 2130.0
3262 2130.0
3263 600.0
3264 630.0
3265 730.0
3266 800.0
3267 830.0
3268 1900.0
3269 1930.0
3270 2000.0
Name: time, Length: 3279, dtype: float64
Then I tried to convert the time column to datetime object:
df["time"] = pd.to_datetime(df.itime)
which produced the following result:
df["time"]
0 1970-01-01 00:00:00.000001000
1 1970-01-01 00:00:00.000001000
2 1970-01-01 00:00:00.000001030
3 1970-01-01 00:00:00.000001100
It appears that it has successfully converted the data to datetime object. However, it added the hour time to ms which is difficult for me to do filtering.
The final data format I would like to get is either:
1970-01-01 06:00:00
or
06:00
Any help is appreciated.
When you read the excel file specify the dtype
of col itime
as a str
:
df = pd.read_excel("test.xlsx", dtype={'itime':str})
then you will have a time column of strings looking like:
df = pd.DataFrame({'itime':['2300', '0100', '0500', '1000']})
Then specify the format and convert to time:
df['Time'] = pd.to_datetime(df['itime'], format='%H%M').dt.time
itime Time
0 2300 23:00:00
1 0100 01:00:00
2 0500 05:00:00
3 1000 10:00:00
Just addon to Chris answer, if you are unable to convert because there is no zero in the front, apply the following to the dataframe.
df['itime'] = df['itime'].apply(lambda x: x.zfill(4))
So basically is that because the original format does not have even leading digit (4 digit). Example: 945 instead of 0945.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With