
How to convert datatype:object to float64 in python?

Tags: python, pandas

I am going around in circles and have tried so many different approaches that I suspect my core understanding is wrong. I would be grateful for help in understanding my encoding/decoding issues.

I import the dataframe from SQL, and it seems that some float64 columns are converted to object dtype, so I cannot do any calculations on them. I have not managed to convert the object columns back to float64.

df.head()

Date       WD  Manpower  2nd    CTR    2ndU  T1   T2   T3   T4
2013/4/6   6   NaN       2,645  5.27%  0.29  407  533  454  368
2013/4/7   7   NaN       2,118  5.89%  0.31  257  659  583  369
2013/4/13  6   NaN       2,470  5.38%  0.29  354  531  473  383
2013/4/14  7   NaN       2,033  6.77%  0.37  396  748  681  458
2013/4/20  6   NaN       2,690  5.38%  0.29  361  528  541  381

df.dtypes

WD          float64
Manpower    float64
2nd          object
CTR          object
2ndU        float64
T1           object
T2           object
T3           object
T4           object
T5           object
dtype: object
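One way to see why a column ended up as object is to list the values that refuse to parse as numbers. The following is a minimal sketch, assuming a small hypothetical frame that mirrors three columns of the df.head() output above; the helper name non_numeric_values is made up for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring three columns of the question's df.head() output.
df = pd.DataFrame({
    "2nd": ["2,645", "2,118", "2,470"],
    "CTR": ["5.27%", "5.89%", "5.38%"],
    "T1": ["407", "257", "354"],
})

def non_numeric_values(col):
    """Return the entries of `col` that pd.to_numeric cannot parse."""
    parsed = pd.to_numeric(col, errors="coerce")
    # Entries that were present but turned into NaN are the ones blocking conversion.
    return col[parsed.isna() & col.notna()]

# Map each column name to the list of values that block a numeric conversion.
blockers = {name: non_numeric_values(df[name]).tolist() for name in df.columns}
print(blockers)
```

In this sample, '2nd' and 'CTR' are blocked by the thousands separator and the percent sign, while 'T1' holds plain digit strings and has no blockers.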

SQL table: [screenshot not reproduced]

Ning Chen asked Feb 02 '15 11:02


People also ask

How do you change an object to a float?

Use the pandas DataFrame.astype() function to convert a column from string/int to float; you can apply it to a specific column or to an entire DataFrame. To cast to a 64-bit float, you can pass numpy.float64 as the target dtype.

How do I convert an object to a number in Python?

A common way to convert one or more columns of a DataFrame to numeric values is to use pandas.to_numeric(). This function tries to convert non-numeric objects (such as strings) into integers or floating-point numbers as appropriate.
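As a rough illustration of how pandas.to_numeric() behaves, the sketch below uses made-up values (not the question's data) to show integer inference, the default error on an unparseable string, and the errors="coerce" option.

```python
import pandas as pd

# Hypothetical values; "abc" stands in for any string that is not a number.
clean = pd.Series(["407", "257"])
dirty = pd.Series(["407", "257", "abc"])

# Digit-only strings are inferred as integers.
ints = pd.to_numeric(clean)

# By default (errors="raise"), an unparseable string raises ValueError.
try:
    pd.to_numeric(dirty)
    raised = False
except ValueError:
    raised = True

# errors="coerce" replaces unparseable entries with NaN, which forces a float dtype.
coerced = pd.to_numeric(dirty, errors="coerce")

print(ints.dtype, raised, coerced.dtype)
```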


2 Answers

You can convert most of the columns by just calling convert_objects:

In [36]:
df = df.convert_objects(convert_numeric=True)
df.dtypes

Out[36]:
Date         object
WD            int64
Manpower    float64
2nd          object
CTR          object
2ndU        float64
T1            int64
T2            int64
T3            int64
T4          float64
dtype: object

For columns '2nd' and 'CTR' we can call the vectorised str methods to remove the thousands separator and the '%' sign, and then call astype to convert:

In [39]:
df['2nd'] = df['2nd'].str.replace(',','').astype(int)
df['CTR'] = df['CTR'].str.replace('%','').astype(np.float64)
df.dtypes

Out[39]:
Date         object
WD            int64
Manpower    float64
2nd           int32
CTR         float64
2ndU        float64
T1            int64
T2            int64
T3            int64
T4           object
dtype: object

In [40]:
df.head()

Out[40]:
        Date  WD  Manpower   2nd   CTR  2ndU   T1   T2   T3   T4
0   2013/4/6   6       NaN  2645  5.27  0.29  407  533  454  368
1   2013/4/7   7       NaN  2118  5.89  0.31  257  659  583  369
2  2013/4/13   6       NaN  2470  5.38  0.29  354  531  473  383
3  2013/4/14   7       NaN  2033  6.77  0.37  396  748  681  458
4  2013/4/20   6       NaN  2690  5.38  0.29  361  528  541  381

Or you can do the string handling operations above without the call to astype and then call convert_objects to convert everything in one go.
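The same "clean the strings, then convert" idea can also be sketched with pd.to_numeric in place of convert_objects. This is only an illustration on a hypothetical two-row frame; the regex=False and rstrip choices are mine, not part of the original answer.

```python
import pandas as pd

# Hypothetical two-row sample of the two problem columns.
df = pd.DataFrame({
    "2nd": ["2,645", "2,118"],
    "CTR": ["5.27%", "5.89%"],
})

# Remove the thousands separator as a literal string, then parse.
df["2nd"] = pd.to_numeric(df["2nd"].str.replace(",", "", regex=False))

# Drop the trailing percent sign, then parse. The values stay in percent
# units (5.27, not 0.0527), matching the output shown in the answer.
df["CTR"] = pd.to_numeric(df["CTR"].str.rstrip("%"))

print(df.dtypes)
```

Letting pd.to_numeric pick the dtype means no explicit astype call is needed: the cleaned '2nd' column comes out as integers and 'CTR' as floats.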

UPDATE

Since version 0.17.0, convert_objects is deprecated, and there is no top-level function that converts a whole DataFrame in one call, so you need to do:

df.apply(lambda col:pd.to_numeric(col, errors='coerce'))

See the docs and this related question: pandas: to_numeric for multiple columns
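One thing to watch with errors='coerce' over the whole frame is that genuinely non-numeric columns, such as the date strings here, become all NaN. A possible variation, sketched on a hypothetical three-column sample, restricts the conversion to a chosen list of columns (numeric_cols is an assumed name).

```python
import pandas as pd

# Hypothetical sample with one text column and two numeric-looking columns.
df = pd.DataFrame({
    "Date": ["2013/4/6", "2013/4/7"],
    "T1": ["407", "257"],
    "T2": ["533", "659"],
})

# Coercing every column turns the date strings into NaN.
all_coerced = df.apply(lambda col: pd.to_numeric(col, errors="coerce"))

# Converting only selected columns leaves 'Date' untouched.
numeric_cols = ["T1", "T2"]  # assumed list; adjust to the real frame
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

print(all_coerced["Date"].isna().all())
print(df.dtypes)
```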

EdChum answered Oct 05 '22 22:10


convert_objects is deprecated.

For pandas >= 0.17.0, use pd.to_numeric

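Strip the thousands separator first, because pd.to_numeric raises a ValueError on strings such as '2,645':

df["2nd"] = pd.to_numeric(df["2nd"].str.replace(",", ""))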
Sesquipedalism answered Oct 05 '22 23:10