Python pandas: how to specify data types when reading an Excel file?

Tags:

python

pandas

dataframe

I am importing an Excel file into a pandas dataframe with the pandas.read_excel() function.

One of the columns is the primary key of the table: it's all numbers, but it's stored as text (the little green triangle in the top left of the Excel cells confirms this).

However, when I import the file into a pandas dataframe, the column gets imported as a float. This means that, for example, '0614' becomes 614.

Is there a way to specify the datatype when importing a column? I understand this is possible when importing CSV files but couldn't find anything in the syntax of read_excel().

The only solution I can think of is to add an arbitrary letter at the beginning of the text in Excel (converting '0614' into 'A0614') to make sure the column is imported as text, and then chop off the 'A' in Python, so I can match it to other tables I am importing from SQL.

959

asked Sep 15 '15 16:09

Pythonista anonymous

1 Answer

You just specify converters for the columns you care about. I created an Excel spreadsheet with the following structure:

    names   ages
    bob     05
    tom     4
    suzy    3

Here the "ages" column is formatted as text. To load it:

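    import pandas as pd

    # Pass each column through str so leading zeros (e.g. '05') are kept.
    # Note: the keyword is sheet_name; older pandas releases spelled it sheetname.
    df = pd.read_excel('Book1.xlsx', sheet_name='Sheet1', header=0,
                       converters={'names': str, 'ages': str})

    >>> df
      names ages
    0   bob   05
    1   tom    4
    2  suzy    3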
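As an alternative to converters, recent pandas versions also accept a dtype argument in read_excel(), much like read_csv(). The following is only a minimal sketch under some assumptions: the column name 'id', the sample values, and both helper functions are hypothetical, and the openpyxl package is assumed to be installed so the workbook can be written and read in memory.

```python
import io

import pandas as pd


def make_sample_workbook() -> io.BytesIO:
    """Build a small in-memory .xlsx file whose 'id' column holds text keys.

    The column names and values are made up for illustration.
    """
    sample = pd.DataFrame({"id": ["0614", "0007"], "name": ["bob", "tom"]})
    buffer = io.BytesIO()
    # Assumes openpyxl is available as the Excel writer engine.
    sample.to_excel(buffer, index=False, engine="openpyxl")
    buffer.seek(0)  # rewind so the buffer can be read from the start
    return buffer


def read_keys_as_text(source) -> pd.DataFrame:
    """Read a workbook, forcing the hypothetical 'id' column to be read as str."""
    return pd.read_excel(source, dtype={"id": str}, engine="openpyxl")
```

Calling read_keys_as_text(make_sample_workbook()) should return a dataframe whose 'id' values are still '0614' and '0007', so they can be matched against text keys from other tables. With a real file, you would pass its path instead of the in-memory buffer.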

162

answered Sep 26 '22 23:09

tnknepp
