Using Python 3.6 and Pandas 0.19.2: How do you read in an Excel file and change a column to datetime straight from read_excel? This is similar to this question about converters and dtypes, but I want to read in a certain column as datetime.
I want to change this:
import pandas as pd
import datetime
import numpy as np
file = 'PATH_HERE'
df1 = pd.read_excel(file)
df1['COLUMN'] = pd.to_datetime(df1['COLUMN']) # <--- Line to get rid of
into something like:
df1 = pd.read_excel(file, dtypes= {'COLUMN': datetime})
The code does not error, but in my example, COLUMN is still a dtype of int64 after calling print(df1['COLUMN'].dtype). I have tried using np.datetime64 instead of datetime. I have also tried using converters= instead of dtypes=, but to no avail. This may be nitpicky, but it would be a nice feature to implement in my code.
We can use the pandas read_excel() function to read Excel file data into a DataFrame object. An Excel sheet is a two-dimensional table, and the DataFrame object also represents a two-dimensional tabular data structure.
The read_excel() function from the pandas library reads Excel files, such as files in the .xls format. It takes the file path as the first argument and the sheet name as the second. It takes an Excel file as input and returns its contents as a DataFrame.
Read an Excel file into a pandas DataFrame. Supports xls, xlsx, xlsm, xlsb, odf, ods and odt file extensions read from a local filesystem or URL.
To read data from CSV or Excel files you can use the pandas library. The function is read_csv() or read_excel(). You have to provide the file path as a string.
Typically, reading Excel sheets uses the dtypes defined in the sheets themselves; in pandas 0.19.2 you cannot specify the dtypes as you can in read_csv, for example. You can instead provide a converters argument, passing a dict that maps each column to a function to call to convert that column:
df1 = pd.read_excel(file, converters= {'COLUMN': pd.to_datetime})
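To illustrate how a converter works: read_excel calls the function once per cell of the named column, so the function receives a single raw value. The sketch below is only an illustration under stated assumptions, not the exact method of this answer. The helper name to_timestamp, the sample values, and the idea that the int64 cells hold dates written as YYYYMMDD are all hypothetical. If that assumption holds, plain pd.to_datetime would not give the intended dates, because it treats a bare integer as nanoseconds since the epoch.

```python
import pandas as pd


def to_timestamp(cell):
    """Hypothetical converter: parse one cell assumed to hold a YYYYMMDD integer."""
    return pd.to_datetime(str(cell), format="%Y%m%d")


# Stand-in for the raw cell values that read_excel would hand to the
# converter one at a time (made-up sample data, not from a real file).
raw_cells = [20170101, 20170215]

converted = pd.Series([to_timestamp(cell) for cell in raw_cells])
print(converted)
print(converted.dtype)
```

With a real workbook, the same function could be passed as pd.read_excel(file, converters={'COLUMN': to_timestamp}). If the cells hold something other than YYYYMMDD integers, the format string would need to change accordingly.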
read_excel supports dtype, just as read_csv does, as of this writing:
import datetime
import pandas as pd
xlsx = pd.ExcelFile('path...')
df = pd.read_excel(xlsx, dtype={'column_name': datetime.datetime})
https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html
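Whichever argument is used, the result can be checked the way the question does, by inspecting the column's dtype. A small sketch, assuming a stand-in DataFrame in place of the frame returned by read_excel (the column names and values here are made up for illustration):

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Stand-in for the DataFrame that pd.read_excel would return.
df = pd.DataFrame(
    {
        "column_name": pd.to_datetime(["2017-01-01", "2017-02-15"]),
        "other": [1, 2],
    }
)

# True only when the column actually holds datetime64 values.
date_ok = is_datetime64_any_dtype(df["column_name"])
other_ok = is_datetime64_any_dtype(df["other"])
print(date_ok, other_ok)
```

If the check comes back False for the date column, the argument was probably ignored or not applied, which matches the int64 result described in the question.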