
Parse_dates in Pandas

The following code fails to parse the date column of my CSV file into dates.

data=pd.read_csv('c:/data.csv',parse_dates=True,keep_date_col = True)  

or

data=pd.read_csv('c:/data.csv',parse_dates=[0])  

The data looks like this:

date          value
30MAR1990    140000
30JUN1990    30000
30SEP1990    120000
30DEC1990    34555

What did I do wrong? Please help!

Thanks.

user3576212 asked May 22 '14 03:05



1 Answer

This is a non-standard format, so the default parser does not recognize it. You can pass your own parser instead:

In [11]: import datetime as dt

In [12]: dt.datetime.strptime('30MAR1990', '%d%b%Y')
Out[12]: datetime.datetime(1990, 3, 30, 0, 0)

# pd.datetime has been removed from recent pandas releases; use dt.datetime directly
In [13]: parser = lambda date: dt.datetime.strptime(date, '%d%b%Y')

# s holds the CSV text; StringIO comes from the io module
In [14]: pd.read_csv(StringIO(s), parse_dates=[0], date_parser=parser)
Out[14]:
        date   value
0 1990-03-30  140000
1 1990-06-30   30000
2 1990-09-30  120000
3 1990-12-30   34555
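The format string can be checked on its own before handing it to pandas. The following is a minimal sketch using only the standard library; `parse_compact_date` and `DATE_FORMAT` are hypothetical names introduced for illustration, and the sketch assumes an English (or C) locale, because `%b` matches locale-dependent month abbreviations.

```python
from datetime import datetime

# %d = day of month, %b = abbreviated month name, %Y = four-digit year
DATE_FORMAT = "%d%b%Y"


def parse_compact_date(text):
    """Parse strings such as '30MAR1990' (hypothetical helper, not part of pandas)."""
    return datetime.strptime(text.strip(), DATE_FORMAT)


# The four date strings from the question
samples = ["30MAR1990", "30JUN1990", "30SEP1990", "30DEC1990"]
parsed = [parse_compact_date(s) for s in samples]

for raw, value in zip(samples, parsed):
    print(raw, "->", value.date().isoformat())
```

If every sample parses without raising `ValueError`, the same format string should be safe to reuse in the pandas calls.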

Another option is to use to_datetime after you've read in the strings:

df['date'] = pd.to_datetime(df['date'], format='%d%b%Y') 
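As a self-contained sketch of this second approach, the snippet below reads the sample rows from the question and converts the column afterwards. It assumes the file is comma-separated, which the question does not state; `csv_text` is a stand-in for the real file.

```python
from io import StringIO

import pandas as pd

# Stand-in for c:/data.csv; the comma delimiter is an assumption
csv_text = """date,value
30MAR1990,140000
30JUN1990,30000
30SEP1990,120000
30DEC1990,34555
"""

# Read the dates as plain strings first
df = pd.read_csv(StringIO(csv_text))

# Convert the string column using an explicit format
df["date"] = pd.to_datetime(df["date"], format="%d%b%Y")

print(df.dtypes)
print(df)
```

Passing an explicit `format` avoids relying on format inference, which may not recognize a compact layout like `30MAR1990`.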
Andy Hayden answered Oct 01 '22 10:10