Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

pd.to_datetime change date format producing wrong dates

Tags:

python

pandas

I have extracted the table below from a csv file :

timestamp   user_id   cost   val1
01/01/2011     1     1     3
01/07/2012     1     19     57
01/09/2013     1     21     63
01/02/2011     2     20     8

And I for this purpose, I used the following statements : import pandas as pd

newnames = ['date','user_id', 'cost', 'val1']
df = pd.read_csv('expenses.csv', names = newnames, header = False)
df['timestamp'] = pd.to_datetime(df['timestamp'],format='%d%m%Y')

But the dates of table obtained in df are wrong : months and days are permutated without reason, i.e 01/09/2013 becomes 09/01/2013 without reason.

Many thanks in advance for your help.

like image 301
Space Avatar asked Apr 01 '14 12:04

Space


People also ask

How do I change the format of a date column?

Select the cell, cell range, or column that you want to reformat. Click Number Format and pick the date format you want.

How do I convert a string to a datetime in Python?

We can convert a string to datetime using strptime() function. This function is available in datetime and time modules to parse a string to datetime and time objects respectively.


2 Answers

Hi it happens sometimes due to US/Europe time parsing compatibilties.

What I follow is,

Case 1

If you are sure that your input file is having the time stamp in a correct format, and you want to use it on top of default parser, like in this case, use (as advised by @filmor) dayfirst option in pd.to_datetime

pd.to_datetime(df['timestamp'], dayfirst=True)

Case 2

When you are not sure about the correctness of the format, use the default, pd.to_datetime(df['timestamp'], dayfirst=True)

like image 197
Pranzell Avatar answered Nov 03 '22 01:11

Pranzell


It is most certainly not random. Pandas defaults to the US date format and falls back if that doesn't make sense, i.e. "12/3/2014" becomes 2014-12-03 while "13/3/2014" will be parsed as 2014-03-13.

You can pass dayfirst=True to pd.read_csv to force European-style date parsing.

like image 22
filmor Avatar answered Nov 03 '22 00:11

filmor