I am using Python to split a datetime column into two, one for the date and one for the time, and to have the column properly formatted.
ORIGINAL DATASET
INCIDENT_DATE
12/31/2006 11:20:00 PM
12/31/2006 11:30:00 PM
01/01/2007 00:25
01/01/2007 00:10
12/31/2006 11:30:00 AM
01/01/2007 00:05
01/01/2007 00:01
12/31/2006 4:45:00 PM
12/31/2006 11:50:00 PM
**01/01/2007**
I used two pieces of code: one to format the column and one to split it. However, after formatting, rows with no time component were given the value 00:00:00, which indicates midnight. See below.
AFTER FORMATTING
2006-12-31 23:20:00
2006-12-31 23:30:00
2007-01-01 00:25:00
2007-01-01 00:10:00
2006-12-31 11:30:00
2007-01-01 00:05:00
2007-01-01 00:01:00
2006-12-31 16:45:00
2006-12-31 23:50:00
**2007-01-01 00:00:00**
Codes used:
## Format datetime column
crimeall['INCIDENT_DATE'] = pd.DatetimeIndex(crimeall['INCIDENT_DATE'])
## Split datetime column
crimeall['TIME'] = crimeall['INCIDENT_DATE'].dt.time
crimeall['DATE'] = crimeall['INCIDENT_DATE'].dt.date
Is there a way to do this without the missing time values being set to 00:00:00? Is it possible to have them recorded as NaN while formatting the datetime?
Any thoughts on how I can get a formatted datetime that shows the missing time values as NaN?
WHAT I WOULD LIKE IT TO LOOK LIKE
2006-12-31 23:20:00
2006-12-31 23:30:00
2007-01-01 00:25:00
2007-01-01 00:10:00
2006-12-31 11:30:00
2007-01-01 00:05:00
2007-01-01 00:01:00
2006-12-31 16:45:00
2006-12-31 23:50:00
**2007-01-01 NaN**
Hoping that there is a way to get this done.
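You could try adding ambiguous='NaT' to pd.DatetimeIndex, though note that this parameter controls how ambiguous daylight-saving times are handled, so it may not affect strings that simply lack a time component. If that doesn't work, you could always patch the values after the split using something like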
crimeall['TIME'] = [np.nan if t.isoformat() == '00:00:00' else t for t in crimeall['TIME']]
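One caveat with the patch above is that it also blanks rows where the time really was midnight. A possible alternative, sketched below, is to record which raw strings contain a time before parsing and to set TIME to NaN only for the others. The sample frame is rebuilt from a few rows of the question, plus one hypothetical genuine-midnight row added for illustration; the names raw, has_time and parsed are my own, and the check assumes a time component always contains a colon.

```python
import numpy as np
import pandas as pd

# Sample rows from the question; the "00:00" row is a hypothetical
# addition to show that a genuine midnight is preserved.
crimeall = pd.DataFrame({
    "INCIDENT_DATE": [
        "12/31/2006 11:20:00 PM",
        "01/01/2007 00:25",
        "12/31/2006 4:45:00 PM",
        "01/01/2007 00:00",
        "01/01/2007",
    ]
})

# Keep the raw strings and flag the rows that actually carry a time.
# Assumption: a time component always contains a colon.
raw = crimeall["INCIDENT_DATE"].astype(str).str.strip()
has_time = raw.str.contains(":", regex=False)

# Parse element by element so the mixed input formats do not need
# to share a single format string.
parsed = pd.DatetimeIndex([pd.to_datetime(s) for s in raw])

crimeall["INCIDENT_DATE"] = parsed
crimeall["DATE"] = parsed.date

# Keep the parsed time only where the raw string had one; NaN elsewhere.
times = pd.Series(parsed.time, index=crimeall.index, dtype=object)
crimeall["TIME"] = times.where(has_time, np.nan)

print(crimeall)
```

With this approach the INCIDENT_DATE column still shows 00:00:00 for the date-only row, since a datetime value cannot hold a missing time, but the separate TIME column carries NaN for that row and keeps 00:00:00 for the real midnight.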