Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas read sql integer became float

Tags:

I met a problem that when I use pandas to read Mysql table, some columns (see 'to_nlc') used to be integer became a float number (automatically add .0 after that). Can anyone figure it out? Or some guessings? Thanks very much!

enter image description here enter image description hereenter image description here

like image 898
Z Xie Avatar asked Jun 13 '16 18:06

Z Xie


People also ask

What does PD Read_sql do?

read_sql. Read SQL query or database table into a DataFrame. This function is a convenience wrapper around read_sql_table and read_sql_query (for backward compatibility).

Is pandas read SQL slow?

Reading SQL queries into Pandas dataframes is a common task, and one that can be very slow. Depending on the database being used, this may be hard to get around, but for those of us using Postgres we can speed this up considerably using the COPY command.

Is Panda faster than SQL?

This main difference can mean that the two tools are separate, however, you can also perform several of the same functions in each respective tool, for example, you can create new features from existing columns in pandas, perhaps easier and faster than in SQL.


1 Answers

Problem is your data contains NaN values, so int is automatically cast to float.

I think you can check NA type promotions:

When introducing NAs into an existing Series or DataFrame via reindex or some other means, boolean and integer types will be promoted to a different dtype in order to store the NAs. These are summarized by this table:

Typeclass   Promotion dtype for storing NAs
floating    no change
object      no change
integer     cast to float64
boolean     cast to object

While this may seem like a heavy trade-off, in practice I have found very few cases where this is an issue in practice. Some explanation for the motivation here in the next section.

like image 82
jezrael Avatar answered Sep 27 '22 20:09

jezrael