
Convert strings to float in all pandas columns, where this is possible

I created a pandas DataFrame from a list of lists:

import numpy as np
import pandas as pd

df_list = [["a", "1", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns = list("ABC"))
print (df)
   A  B    C
0  a  1    2
1  b  3  NaN

Is there a way to convert every column that can be converted to float, i.e. B and C? The following works if you know which columns to convert:

  df[["B", "C"]] = df[["B", "C"]].astype("float")

But what do you do if you don't know in advance which columns contain numbers? When I tried

  df = df.astype("float", errors = "ignore")

all columns were still strings/objects. Similarly,

df[["B", "C"]] = df[["B", "C"]].apply(pd.to_numeric)

converts both columns (although B becomes int and C becomes float because of the NaN value), but

df = df.apply(pd.to_numeric)

raises an error, since column A cannot be parsed, and I don't see a way to suppress it.
Is there a way to perform this string-to-float conversion without looping over the columns and trying .astype("float", errors = "ignore") on each one?

Mr. T asked Jan 30 '18 10:01



1 Answer

I think you need the errors='ignore' parameter of to_numeric:

df = df.apply(pd.to_numeric, errors='ignore')
print (df.dtypes)
A     object
B      int64
C    float64
dtype: object
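
Note that newer pandas releases (2.2 and later, as far as I know) deprecate errors='ignore' in to_numeric and recommend catching the exception explicitly instead. A minimal sketch of that idea follows; the helper name to_numeric_where_possible is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd

def to_numeric_where_possible(frame):
    """Return a new frame in which every fully parseable column is numeric."""
    def convert(column):
        try:
            return pd.to_numeric(column)
        except (ValueError, TypeError):
            # At least one value could not be parsed: keep the column as it is
            return column
    return frame.apply(convert)

df = pd.DataFrame([["a", "1", "2"], ["b", "3", np.nan]], columns=list("ABC"))
converted = to_numeric_where_possible(df)
# B and C become numeric, A is returned unchanged
print(converted.dtypes)
```

This still visits each column once, but so does df.apply(pd.to_numeric, errors='ignore'); the loop is just hidden inside apply.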

With errors='ignore', a column is converted only if all of its values are numeric. A column that mixes numbers with strings is left unchanged, as column B shows after "t" is added to it:

df_list = [["a", "t", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns = list("ABC"))

df = df.apply(pd.to_numeric, errors='ignore')
print (df)
   A  B    C
0  a  t  2.0
1  b  3  NaN

print (df.dtypes)
A     object
B     object
C    float64
dtype: object
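
If you would rather convert a mixed column anyway and accept missing values for the entries that cannot be parsed, to_numeric also takes errors='coerce'. A small sketch on column B of the mixed frame; note that this silently replaces "t" with NaN, so it only fits cases where losing those values is acceptable.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([["a", "t", "2"], ["b", "3", np.nan]], columns=list("ABC"))

# errors='coerce' turns every unparseable entry into NaN instead of raising
coerced_b = pd.to_numeric(df["B"], errors="coerce")
print(coerced_b.tolist())  # [nan, 3.0]
```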

EDIT:

You can also downcast the ints to floats. Applied to the original DataFrame from the question, where B is fully numeric:

df = df.apply(pd.to_numeric, errors='ignore', downcast='float')
print (df.dtypes)
A     object
B    float32
C    float32
dtype: object

This is the same as:

df = df.apply(lambda x: pd.to_numeric(x, errors='ignore', downcast='float'))
print (df.dtypes)
A     object
B    float32
C    float32
dtype: object
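
Since downcast='float' picks the smallest float type, the columns end up as float32. If float64 is wanted, as in the question's astype("float") call, one option is to try astype("float64") per column and keep the column as it is when the cast fails. This is only a sketch; as_float64_if_possible is a hypothetical helper name, not a pandas function.

```python
import numpy as np
import pandas as pd

def as_float64_if_possible(column):
    """Cast a column to float64, or return it untouched if the cast fails."""
    try:
        return column.astype("float64")
    except (ValueError, TypeError):
        # Non-numeric strings such as "a" cannot be cast: leave the column alone
        return column

df = pd.DataFrame([["a", "1", "2"], ["b", "3", np.nan]], columns=list("ABC"))
as_float = df.apply(as_float64_if_possible)
print(as_float.dtypes)
```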

jezrael answered Sep 19 '22 16:09