
Combine year, month and day in Python to create a date

I have a dataframe that consists of separate columns for year, month and day. I tried to combine these individual columns into one date using:

df['myDt']=pd.to_datetime(df[['year','month','day']])

only to get the following error: "to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing". I'm not sure what this means, since I'm already supplying the relevant columns. On checking the datatypes, I found that the Year, Month and Day columns are int64. Would that be causing an issue? Thanks, Chet

Thank you all for posting. As suggested, I'm posting the sample data set first:

                Value  mm    yy  dd
Date
2018-11-30  88.550067  11  2018   1
2018-12-31  88.906290  12  2018   1
2019-01-31  88.723000   1  2019   1
2019-02-28  89.509179   2  2019   1
2019-03-31  90.049161   3  2019   1
2019-04-30  90.523100   4  2019   1
2019-05-31  90.102484   5  2019   1
2019-06-30  91.179400   6  2019   1
2019-07-31  90.963570   7  2019   1
2019-08-31  92.159170   8  2019   1

The data source is: https://www.quandl.com/data/EIA/STEO_NGPRPUS_M

I imported the data as follows:

1. import quandl (installed with conda install first)
2. Used Quandl's Python code:

data = quandl.get("EIA/STEO_NGPRPUS_M", authtoken="TOKEN", start_date="2005-01-01", end_date="2005-12-31")

3. Just to note, the original data comes only with the Value column, and a DateTime index. I extracted and created the mm, yy and dd columns (month, year, and dd, which is a column vector set to 1).

All I'm trying to do is create another column called "first of the month", so that for each day of each month the column will just show "MM/YY/1". I'm going to try out all the suggestions below shortly and get back to you guys. Thanks!!
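For anyone reproducing the setup, the column extraction described above could look roughly like the sketch below. It assumes a dataframe with a DatetimeIndex named Date and a single Value column, with a few values copied from the sample table; the exact code used to build mm, yy and dd is not shown in the question, so the three assignments are a guess at it.

```python
import pandas as pd

# Assumed stand-in for the Quandl result: a DatetimeIndex plus a Value column.
idx = pd.to_datetime(["2018-11-30", "2018-12-31", "2019-01-31"])
data = pd.DataFrame({"Value": [88.550067, 88.906290, 88.723000]}, index=idx)
data.index.name = "Date"

# Pull month and year out of the index; dd is a constant column of 1s.
data["mm"] = data.index.month
data["yy"] = data.index.year
data["dd"] = 1

print(data)
```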

asked Sep 24 '19 03:09 by Chet




2 Answers

Solution

You could use datetime.datetime along with .apply().

import datetime

d = datetime.datetime(2020, 5, 17)
date = d.date()
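The snippet above builds a single date. Applied row by row to a dataframe, the same idea could look like the sketch below. The column names yy, mm and dd and the target column myDt come from the question; the two sample rows are illustrative only.

```python
import datetime

import pandas as pd

df = pd.DataFrame({"yy": [2018, 2019], "mm": [11, 1], "dd": [1, 1]})

# Build one datetime per row; int() converts the NumPy integers
# in each row to plain Python ints before they reach datetime.datetime.
df["myDt"] = df.apply(
    lambda row: datetime.datetime(int(row["yy"]), int(row["mm"]), int(row["dd"])),
    axis=1,
)

print(df)
```

Row-wise .apply() runs a Python function per row, so on large dataframes the vectorised pd.to_datetime approach is usually the faster choice.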

For pandas.to_datetime(df)

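Your code looks fine, and int64 columns are not a problem, provided the columns are actually named year, month and day. pd.to_datetime only assembles a date from a dataframe when it finds columns with those names (or their plurals), which is what the error message is complaining about. See the pandas.to_datetime documentation and the question "How to convert columns into one datetime column in pandas?".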

import pandas as pd

df = pd.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'day': [4, 5]})
pd.to_datetime(df[["year", "month", "day"]])

Output:

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]
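For contrast, the error from the question can be reproduced when the columns carry other names. The sketch below assumes the dataframe uses the yy, mm and dd headers shown in the question's sample data; the values are the same illustrative ones as above.

```python
import pandas as pd

df = pd.DataFrame({"yy": [2015, 2016], "mm": [2, 3], "dd": [4, 5]})

try:
    pd.to_datetime(df[["yy", "mm", "dd"]])
    raised = False
except ValueError as exc:
    # pandas cannot find year/month/day among the column names.
    raised = True
    message = str(exc)
    print(message)
```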

What if your YEAR, MONTH and DAY columns have different headers?

Let's say your YEAR, MONTH and DAY columns are labeled yy, mm and dd respectively, and you prefer to keep your column names unchanged. In that case you could do it as follows.

import pandas as pd

df = pd.DataFrame({'yy': [2015, 2016],
                   'mm': [2, 3],
                   'dd': [4, 5]})
df2 = df[["yy", "mm", "dd"]].copy()
df2.columns = ["year", "month", "day"]
pd.to_datetime(df2)

Output:

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]
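A slightly shorter variant of the same idea is to rename through a mapping and assign the result straight back as a new column. This is only a sketch: the column name first_of_month is made up for illustration, and the rows mimic the question's data, where dd is always 1.

```python
import pandas as pd

df = pd.DataFrame({"yy": [2018, 2018, 2019],
                   "mm": [11, 12, 1],
                   "dd": [1, 1, 1]})

# rename() returns a renamed copy, so df keeps its yy/mm/dd headers.
renamed = df[["yy", "mm", "dd"]].rename(
    columns={"yy": "year", "mm": "month", "dd": "day"}
)
df["first_of_month"] = pd.to_datetime(renamed)

print(df)
```

Because dd is fixed at 1, every value in the new column falls on the first day of its month, which is what the question asks for.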
answered Oct 04 '22 14:10 by CypherX


Here is a two-liner:

df['dateInt'] = df['year'].astype(str) + df['month'].astype(str).str.zfill(2) + df['day'].astype(str).str.zfill(2)
df['Date'] = pd.to_datetime(df['dateInt'], format='%Y%m%d')

Output

   year  month  day   dateInt       Date
0  2015      5   20  20150520 2015-05-20
1  2016      6   21  20160621 2016-06-21
2  2017      7   22  20170722 2017-07-22
3  2018      8   23  20180823 2018-08-23
4  2019      9   24  20190924 2019-09-24
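As a self-contained sketch of the same approach, the version below includes the import and a small sample frame. The rows are illustrative, with a single-digit day added to show why the zfill(2) padding matters: the '%Y%m%d' format expects a fixed-width, eight-character string.

```python
import pandas as pd

df = pd.DataFrame({"year": [2015, 2016], "month": [5, 6], "day": [3, 21]})

# zfill(2) pads single-digit months and days: 5 -> "05", 3 -> "03".
df["dateInt"] = (
    df["year"].astype(str)
    + df["month"].astype(str).str.zfill(2)
    + df["day"].astype(str).str.zfill(2)
)
df["Date"] = pd.to_datetime(df["dateInt"], format="%Y%m%d")

print(df)
```

Despite its name, dateInt holds strings here, not integers.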
answered Oct 04 '22 15:10 by Grant Shannon