Say I have a column in a dataframe that has some numbers and some non-numbers:
>> df['foo']
0      0.0
1    103.8
2    751.1
3      0.0
4      0.0
5        -
6        -
7      0.0
8        -
9      0.0
Name: foo, Length: 9, dtype: object
How can I convert this column to np.float, and have everything that is not a float converted to NaN?
When I try:
>> df['foo'].astype(np.float)
or
>> df['foo'].apply(np.float)
I get ValueError: could not convert string to float: -
to_numeric()
The best way to convert one or more columns of a DataFrame to numeric values is to use pandas.to_numeric(). This function tries to convert non-numeric objects (such as strings) into integers or floating-point numbers as appropriate.
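As a minimal sketch of how this applies to the column in the question: the data below is a hypothetical stand-in for df['foo'] (floats mixed with the string '-'), and the variable names are illustrative. Passing errors='coerce' asks pd.to_numeric to replace anything it cannot parse with NaN instead of raising ValueError.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['foo']: floats mixed with the string '-',
# which is why the column ends up with dtype object.
foo = pd.Series([0.0, 103.8, 751.1, 0.0, '-', '-', 0.0], name='foo', dtype=object)

# errors='coerce' converts unparseable entries to NaN instead of raising.
foo_numeric = pd.to_numeric(foo, errors='coerce')

print(foo_numeric.dtype)         # float64
print(foo_numeric.isna().sum())  # 2
```

The default is errors='raise', which fails on '-' in the same way astype does.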
Method 1: Using the replace() method
Replacing values is one way to convert categorical terms into numbers. For example, take a dataset of people's salaries by level of education. Education level is an ordinal categorical variable, so its levels can be mapped to numbers that preserve their order.
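A rough sketch of that idea follows. The dataset, the column names, and the particular level-to-number mapping are all assumptions made up for illustration; only Series.replace itself comes from the text above.

```python
import pandas as pd

# Hypothetical salary data; names and values are illustrative only.
people = pd.DataFrame({
    'education': pd.Series(
        ['High School', 'Bachelors', 'Masters', 'Bachelors'], dtype=object
    ),
    'salary': [40000, 55000, 70000, 58000],
})

# Assumed ordinal mapping: a higher number means a higher education level.
education_levels = {'High School': 1, 'Bachelors': 2, 'Masters': 3}

# replace() swaps each label for its code; astype(int) makes the
# resulting dtype explicit rather than relying on inference.
people['education_code'] = (
    people['education'].replace(education_levels).astype(int)
)

print(people)
```

This fits when the set of labels is known in advance. For free-form junk such as the '-' entries in the question, coercing with pd.to_numeric is the more direct tool.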
In pandas 0.17.0, convert_objects raises a warning:
FutureWarning: convert_objects is deprecated. Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.
You could apply pd.to_numeric to the dataframe, passing 'coerce' as its errors argument:
df1 = df.apply(pd.to_numeric, args=('coerce',))
or maybe more appropriately:
df1 = df.apply(pd.to_numeric, errors='coerce')
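To illustrate what the keyword form does, here is a small self-contained sketch. The frame and its column names are hypothetical; the call itself is the one shown above.

```python
import pandas as pd

# Hypothetical frame: 'foo' mixes floats with '-', 'bar' holds numeric strings.
df = pd.DataFrame({
    'foo': [0.0, 103.8, '-', 0.0],
    'bar': ['1', '2', '3', 'n/a'],
})

# apply() calls pd.to_numeric once per column and forwards the extra
# keyword argument errors='coerce' to each call.
df1 = df.apply(pd.to_numeric, errors='coerce')

print(df1)
print(df1.isna().sum())  # one NaN per column in this example
```

Note that coercing a whole frame also turns genuinely textual columns into all-NaN columns, so it may be safer to apply this only to the columns that are expected to be numeric.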
EDIT
The above method is only valid for pandas version >= 0.17.0. From the "what's new in pandas 0.17.0" docs:
pd.to_numeric is a new function to coerce strings to numbers (possibly with coercion) (GH11133)