I have a dataframe where the first 3 columns are 'MONTH', 'DAY', 'YEAR'.
Each column holds an integer. Is there a Pythonic way to convert all three columns into datetimes while they are in the dataframe?
From:
    M  D  Y     Apples  Oranges
    5  6  1990  12      3
    5  7  1990  14      4
    5  8  1990  15      34
    5  9  1990  23      21
into:
    Datetimes  Apples  Oranges
    1990-6-5   12      3
    1990-7-5   14      4
    1990-8-5   15      34
    1990-9-5   23      21
You can use DataFrame.apply() together with pd.to_datetime() to convert multiple columns to datetime. apply() applies a function along an axis of the DataFrame: to each column by default, or to each row with axis=1.
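A minimal sketch of the row-wise apply() approach, using the question's sample data. The helper name row_to_datetime and the new column name Datetimes are assumptions made for illustration; this is one possible way to combine apply() and pd.to_datetime(), and it loops over rows, so it is slower than the vectorized answers further down.

```python
from datetime import datetime

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "M": [5, 5, 5, 5],
    "D": [6, 7, 8, 9],
    "Y": [1990, 1990, 1990, 1990],
    "Apples": [12, 14, 15, 23],
    "Oranges": [3, 4, 34, 21],
})


def row_to_datetime(row):
    # Hypothetical helper: build one datetime from a row's Y, M and D values.
    return datetime(int(row["Y"]), int(row["M"]), int(row["D"]))


# axis=1 calls the helper once per row; pd.to_datetime then makes sure
# the resulting column has a datetime dtype.
df["Datetimes"] = pd.to_datetime(df.apply(row_to_datetime, axis=1))

print(df[["Datetimes", "Apples", "Oranges"]])
```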
Step #1: Load numpy and pandas.
Step #2: Create random data and use it to create a pandas dataframe.
Step #3: Convert multiple lists into a single dataframe by creating a dictionary that gives each list a name.
Step #4: Then convert the pandas dataframe into a dict.
pandas combine() function: Timestamp.combine() takes a date and a time value and combines them into a single pandas Timestamp object.
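As a small illustration, here is how pd.Timestamp.combine() might be called. It takes a datetime.date and a datetime.time object rather than raw strings; the specific date and time values below are made up for the example.

```python
from datetime import date, time

import pandas as pd

# Combine a date object and a time object into one pandas Timestamp.
stamp = pd.Timestamp.combine(date(1990, 5, 6), time(14, 30))

print(stamp)
```

This works on single values, so for whole columns the to_datetime() approaches in the answers are the better fit.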
In 0.13 (coming very soon), this is heavily optimized and quite fast (it is still pretty fast in 0.12); both are orders of magnitude faster than looping.
    In [3]: df
    Out[3]:
       M  D     Y  Apples  Oranges
    0  5  6  1990      12        3
    1  5  7  1990      14        4
    2  5  8  1990      15       34
    3  5  9  1990      23       21

    In [4]: df.dtypes
    Out[4]:
    M          int64
    D          int64
    Y          int64
    Apples     int64
    Oranges    int64
    dtype: object

    # in 0.12, use this
    In [5]: pd.to_datetime((df.Y*10000+df.M*100+df.D).apply(str),format='%Y%m%d')

    # in 0.13 the above or this will work
    In [5]: pd.to_datetime(df.Y*10000+df.M*100+df.D,format='%Y%m%d')
    Out[5]:
    0   1990-05-06 00:00:00
    1   1990-05-07 00:00:00
    2   1990-05-08 00:00:00
    3   1990-05-09 00:00:00
    dtype: datetime64[ns]
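The arithmetic trick above packs each row into a single integer such as 19900506, which can then be parsed with the '%Y%m%d' format. A self-contained sketch of the same idea as a script, using the string form from the 0.12 variant; the variable names as_int and dates are assumptions for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "M": [5, 5, 5, 5],
    "D": [6, 7, 8, 9],
    "Y": [1990, 1990, 1990, 1990],
    "Apples": [12, 14, 15, 23],
    "Oranges": [3, 4, 34, 21],
})

# Year, month and day packed into one integer per row, e.g. 19900506.
as_int = df["Y"] * 10000 + df["M"] * 100 + df["D"]

# Convert to strings so each value matches the '%Y%m%d' format exactly.
dates = pd.to_datetime(as_int.astype(str), format="%Y%m%d")

print(dates)
```

Multiplying the month and day by powers of ten zero-pads them implicitly, which is why single-digit months and days still parse correctly.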
In version 0.18.1 you can use to_datetime, but:
- the column names have to be year, month, day, hour, minute and second
- the minimal required columns are year, month and day
Sample:
    import pandas as pd

    df = pd.DataFrame({'year': [2015, 2016],
                       'month': [2, 3],
                       'day': [4, 5],
                       'hour': [2, 3],
                       'minute': [10, 30],
                       'second': [21, 25]})

    print(df)
       day  hour  minute  month  second  year
    0    4     2      10      2      21  2015
    1    5     3      30      3      25  2016

    # year, month and day are the minimum
    print(pd.to_datetime(df[['year', 'month', 'day']]))
    0   2015-02-04
    1   2016-03-05
    dtype: datetime64[ns]

    print(pd.to_datetime(df[['year', 'month', 'day', 'hour']]))
    0   2015-02-04 02:00:00
    1   2016-03-05 03:00:00
    dtype: datetime64[ns]

    print(pd.to_datetime(df[['year', 'month', 'day', 'hour', 'minute']]))
    0   2015-02-04 02:10:00
    1   2016-03-05 03:30:00
    dtype: datetime64[ns]

    # all six columns
    print(pd.to_datetime(df))
    0   2015-02-04 02:10:21
    1   2016-03-05 03:30:25
    dtype: datetime64[ns]
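Since the question's columns are named M, D and Y rather than month, day and year, one way to apply this to the original frame is to rename them first. A sketch under that assumption; the names renamed, dates and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "M": [5, 5, 5, 5],
    "D": [6, 7, 8, 9],
    "Y": [1990, 1990, 1990, 1990],
    "Apples": [12, 14, 15, 23],
    "Oranges": [3, 4, 34, 21],
})

# to_datetime expects the columns to be called year, month and day.
renamed = df.rename(columns={"Y": "year", "M": "month", "D": "day"})
dates = pd.to_datetime(renamed[["year", "month", "day"]])

# Build the layout asked for in the question: one datetime column
# followed by the remaining data columns.
result = df.drop(columns=["M", "D", "Y"])
result.insert(0, "Datetimes", dates)

print(result)
```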
Another solution is to convert the columns to a dictionary:
    print(df)
       M  D     Y  Apples  Oranges
    0  5  6  1990      12        3
    1  5  7  1990      14        4
    2  5  8  1990      15       34
    3  5  9  1990      23       21

    print(pd.to_datetime(dict(year=df.Y, month=df.M, day=df.D)))
    0   1990-05-06
    1   1990-05-07
    2   1990-05-08
    3   1990-05-09
    dtype: datetime64[ns]
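To replace the three integer columns in the dataframe itself, the dictionary approach can be combined with DataFrame.pop(), which returns a column and removes it from the frame in one step. This is a sketch, not the only way to do it; the column name Datetimes is taken from the question's desired output.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "M": [5, 5, 5, 5],
    "D": [6, 7, 8, 9],
    "Y": [1990, 1990, 1990, 1990],
    "Apples": [12, 14, 15, 23],
    "Oranges": [3, 4, 34, 21],
})

# pop() hands each column to the dictionary and drops it from df.
dates = pd.to_datetime({
    "year": df.pop("Y"),
    "month": df.pop("M"),
    "day": df.pop("D"),
})

# Put the combined column at the front of the frame.
df.insert(0, "Datetimes", dates)

print(df)
```

Note that this modifies df in place, so work on a copy if the original integer columns are still needed.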