
Split int64 Pandas column in two

Tags: python, pandas

I've been given a dataset that stores dates as integers, using the format 52019 for May 2019. I've loaded it into a Pandas DataFrame, and I need to split that column into a month column and a year column, but I can't figure out how to do that for an int64 column or how to handle two-digit months. So I want to take something like

ID    Date
1    22019
2    32019
3    52019
5    102019

and make it become

ID    Month    Year
1     2        2019
2     3        2019
3     5        2019
5     10       2019

What should I do?

nostradukemas asked Nov 29 '22 21:11


2 Answers

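Use divmod: the year always occupies the last four digits, so dividing by 10000 gives the month as the quotient and the year as the remainder.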

import numpy as np
df['Month'], df['Year'] = np.divmod(df.Date, 10000)

df

   ID    Date  Month  Year
0   1   22019      2  2019
1   2   32019      3  2019
2   3   52019      5  2019
3   5  102019     10  2019
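
To see why 10000 is the right divisor, note that 102019 = 10 * 10000 + 2019. A minimal self-contained sketch of the step above, assuming the sample data from the question (the intermediate names month and year are only for illustration):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"ID": [1, 2, 3, 5], "Date": [22019, 32019, 52019, 102019]})

# The year is the last four digits, so for divisor 10**4:
# quotient = month, remainder = year (102019 -> 10 and 2019).
month, year = np.divmod(df["Date"], 10000)
df["Month"] = month
df["Year"] = year

print(df)
```

This relies on every value ending in a four-digit year; a value stored in a different layout would be split incorrectly.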

To leave the original DataFrame unmodified, use assign, which returns a new DataFrame:

df.assign(**dict(zip(['Month', 'Year'], np.divmod(df.Date, 10000))))

   ID    Date  Month  Year
0   1   22019      2  2019
1   2   32019      3  2019
2   3   52019      5  2019
3   5  102019     10  2019
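
The dict(zip(...)) call only builds the keyword arguments Month=... and Year=.... A sketch of the same idea with explicit keywords, assuming the question's sample data (result is a hypothetical name), which also shows that df itself keeps its original columns:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"ID": [1, 2, 3, 5], "Date": [22019, 32019, 52019, 102019]})

month, year = np.divmod(df["Date"], 10000)

# assign returns a new DataFrame; df is left as it was.
result = df.assign(Month=month, Year=year)

print(list(df.columns))      # ['ID', 'Date']
print(list(result.columns))  # ['ID', 'Date', 'Month', 'Year']
```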
piRSquared answered Dec 01 '22 17:12


Using floor division (//) and modulo (%)

df['Month'], df['Year'] = df.Date // 10000, df.Date % 10000
df
Out[528]: 
   ID    Date  Month  Year
0   1   22019      2  2019
1   2   32019      3  2019
2   3   52019      5  2019
3   5  102019     10  2019
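
Floor division and modulo are the two halves of divmod, so this produces the same columns. A small sketch that checks this on the question's sample data; the range check on the month is an added sanity check, not something the question asks for:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"ID": [1, 2, 3, 5], "Date": [22019, 32019, 52019, 102019]})

df["Month"] = df["Date"] // 10000  # everything before the last four digits
df["Year"] = df["Date"] % 10000    # the last four digits

# Compare against np.divmod, which returns (quotient, remainder).
quotient, remainder = np.divmod(df["Date"], 10000)
same_as_divmod = bool((df["Month"] == quotient).all() and (df["Year"] == remainder).all())

# Optional sanity check: every extracted month should fall in 1..12.
months_valid = bool(df["Month"].between(1, 12).all())

print(same_as_divmod, months_valid)
```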
BENY answered Dec 01 '22 15:12