I am working with Python pandas and MS Excel to edit an xlsx file, going back and forth between the two programs. The file contains some columns with text that looks like numbers, e.g., a column A holding the text values 001 and 100.
If I read this, I get
pd.read_excel('test.xlsx')
     A
0    1
1  100
and
pd.read_excel('test.xlsx').dtypes
A    int64
dtype: object
My question is: how is it possible to read the text as text? It is not an option to parse it back after reading, because part of the information (i.e., the leading zeros) is lost upon conversion to a number.
Thank you for your help.
You can work around the known issue (assuming that you know the column name) by using the 'converters' parameter:
>>> pd.read_excel('test.xlsx', converters={'A': str})
     A
0  001
1  100
>>> pd.read_excel('test.xlsx', converters={'A': str}).dtypes
A    object
dtype: object
According to this issue, it's a known problem with pandas.
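If the column names are not known in advance, a possible variant is to pass dtype=str so that every column is read as text. The sketch below is only an illustration under a few assumptions: it builds a throwaway workbook in a temporary directory (the path and the sample data are made up for the example) and it assumes the openpyxl package is installed so pandas can write and read .xlsx files.

```python
import os
import tempfile

import pandas as pd

# Build a small workbook whose column A holds text that looks like numbers.
# The file location and contents are hypothetical, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
pd.DataFrame({"A": ["001", "100"]}).to_excel(path, index=False)

# Option 1: convert one known column while reading.
by_converter = pd.read_excel(path, converters={"A": str})

# Option 2: read every column as text, useful when column names are unknown.
all_text = pd.read_excel(path, dtype=str)

print(by_converter["A"].tolist())
print(all_text["A"].tolist())
```

Both calls should keep the leading zeros. The converters form touches only the columns you name, while dtype=str also turns genuinely numeric columns into text, so which one fits depends on the rest of the sheet.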