 

Pandas, convert datetime format mm/dd/yyyy to dd/mm/yyyy

The dates in my CSV file are in dd/mm/yyyy format. When I convert the column to datetime with df['Date'] = pd.to_datetime(df['Date']), the format changes to mm/dd/yyyy.

I then used df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%d/%m/%Y') to convert to dd/mm/yyyy, but the values are now strings (object dtype), and I need them as datetime. When I run df['Date'] = pd.to_datetime(df['Date']) again, it goes back to the previous format. How can I keep the column as datetime in dd/mm/yyyy format?

Amn Kh asked Oct 18 '18 08:10





3 Answers

You can use the parse_dates and dayfirst arguments of pd.read_csv (see the pandas documentation for read_csv()):

df = pd.read_csv('myfile.csv', parse_dates=['Date'], dayfirst=True)

This will read the Date column as datetime values, correctly taking the first part of the date input as the day. Note that in general you will want your dates to be stored as datetime objects.

Then, if you need to output the dates as strings, you can call dt.strftime():

df['Date'].dt.strftime('%d/%m/%Y')
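To check the approach above without a file on disk, here is a minimal sketch that reads a small in-memory CSV. The sample rows and the Value column are made up for illustration; only parse_dates, dayfirst and dt.strftime come from the answer.

```python
import io

import pandas as pd

# Hypothetical CSV content with day-first dates (dd/mm/yyyy).
csv_text = "Date,Value\n03/04/2018,1\n18/10/2018,2\n"

# dayfirst=True makes pandas read 03/04/2018 as 3 April, not 4 March.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], dayfirst=True)

print(df["Date"].dtype)              # a datetime64 dtype
print(df["Date"].dt.month.tolist())  # [4, 10]

# Format as text only when the dates need to be displayed or exported.
as_text = df["Date"].dt.strftime("%d/%m/%Y")
print(as_text.tolist())              # ['03/04/2018', '18/10/2018']
```

The column itself stays datetime; the dd/mm/yyyy text exists only in as_text.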
rje answered Oct 07 '22 10:10



When I use again this: df['Date'] = pd.to_datetime(df['Date']), it gets back to the previous format.

No, you cannot simultaneously have the string format of your choice and keep your series of type datetime. As remarked here:

datetime series are stored internally as integers. Any human-readable date representation is just that, a representation, not the underlying integer. To access your custom formatting, you can use methods available in Pandas. You can even store such a text representation in a pd.Series variable:

formatted_dates = df['datetime'].dt.strftime('%m/%d/%Y')

The dtype of formatted_dates will be object, which indicates that the elements of your series point to arbitrary Python types. In this case, those arbitrary types happen to be all strings.
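A small sketch of the distinction between the stored values and their text representation. The sample dates are invented, and the exact dtype names printed can differ between pandas versions, so treat the printed output as illustrative.

```python
import pandas as pd

# Hypothetical sample data.
dates = pd.Series(pd.to_datetime(["2018-10-18", "2018-04-03"]))

# The underlying storage: integers counting time units since the Unix epoch.
as_int = dates.astype("int64")
print(as_int.dtype)

# A text representation: the same dates, but every element is now a str.
formatted = dates.dt.strftime("%d/%m/%Y")
print(formatted.tolist())  # ['18/10/2018', '03/04/2018']
print(formatted.dtype)

# Parsing the text back with an explicit format recovers the original values.
round_trip = pd.to_datetime(formatted, format="%d/%m/%Y")
print((round_trip == dates).all())  # True
```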

Lastly, I strongly recommend you do not convert a datetime series to strings until the very last step in your workflow. This is because as soon as you do so, you will no longer be able to use efficient, vectorised operations on such a series.
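One concrete way this goes wrong is sorting. The sketch below uses invented dates to compare sorting the datetime series with sorting its dd/mm/yyyy strings; it is meant only as an illustration of why formatting should come last.

```python
import pandas as pd

# Hypothetical sample data spanning two years.
dates = pd.Series(pd.to_datetime(["2018-10-18", "2019-01-02", "2018-04-03"]))
text = dates.dt.strftime("%d/%m/%Y")

# Sorting datetimes is chronological; formatting happens as the last step.
sorted_dates = dates.sort_values().dt.strftime("%d/%m/%Y").tolist()
print(sorted_dates)  # ['03/04/2018', '18/10/2018', '02/01/2019']

# Sorting the strings compares characters, so the day digits decide the order.
sorted_text = text.sort_values().tolist()
print(sorted_text)   # ['02/01/2019', '03/04/2018', '18/10/2018']
```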

jpp answered Oct 07 '22 10:10



This solution will work for all cases where a column has mixed date formats. Add more conditions to the function if needed. Pandas to_datetime() function was not working for me, but this seems to work well.

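import datetime

import pandas as pd


def format_date(val):
    # pandas reads ambiguous dates month-first by default; render that reading as text.
    a = pd.to_datetime(val, errors='coerce', cache=False).strftime('%m/%d/%Y')
    try:
        # Re-read the text day-first, which swaps day and month when both are valid.
        date_time_obj = datetime.datetime.strptime(a, '%d/%m/%Y')
    except ValueError:
        # The day-first reading is impossible (e.g. month 13), so keep the month-first one.
        date_time_obj = datetime.datetime.strptime(a, '%m/%d/%Y')
    return date_time_obj.date()

Saving the changes to the same column.

df['Date'] = df['Date'].apply(format_date)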

Saving as CSV.

df.to_csv(f'{file_name}.csv', index=False, date_format='%s')
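For comparison, here is a hedged sketch of writing dd/mm/yyyy text with to_csv when the column is a real datetime column. It assumes the goal is dd/mm/yyyy output, uses an in-memory buffer instead of a file, and the sample frame is made up.

```python
import io

import pandas as pd

# Hypothetical frame whose Date column has a datetime64 dtype.
df = pd.DataFrame(
    {"Date": pd.to_datetime(["2018-04-03", "2018-10-18"]), "Value": [1, 2]}
)

# date_format controls how datetime values are rendered in the output text.
buffer = io.StringIO()
df.to_csv(buffer, index=False, date_format="%d/%m/%Y")

csv_text = buffer.getvalue()
print(csv_text.splitlines())  # ['Date,Value', '03/04/2018,1', '18/10/2018,2']
```

The data in memory stays datetime; only the exported text uses the day-first format.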
SummmerFort answered Oct 07 '22 09:10
