I have a very large Pandas DataFrame that looks like this:
>>> import pandas as pd
>>> d = pd.DataFrame({"a": ["1", "U", "3.4"]})
>>> d
     a
0    1
1    U
2  3.4
The column currently has dtype object:
>>> d.dtypes
a    object
dtype: object
I'd like to convert this column to float so that I can use groupby() and compute the mean. When I try astype, I get the expected error, because one of the strings can't be cast to float:
>>> d.a.astype(float)
ValueError: could not convert string to float: 'U'
What I'd like to do is cast all the elements to float and replace the ones that can't be cast with NaN.
How can I do this?
I tried setting raise_on_error=False, but it doesn't work: the dtype is still object.
>>> d.a.astype(float, raise_on_error=False)
0      1
1      U
2    3.4
Name: a, dtype: object
Use pd.to_numeric with errors='coerce', which turns any string that can't be parsed as a number into NaN:
>>> pd.to_numeric(d['a'], errors='coerce')
0    1.0
1    NaN
2    3.4
Name: a, dtype: float64
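To tie this back to the groupby() and mean goal in the question, here is a minimal sketch. The "group" column and its values are made up for illustration (the question does not show the grouping key), so adapt the names to your own data. The result is assigned back to the column, since to_numeric returns a new Series rather than modifying the DataFrame in place.

```python
import pandas as pd

# Hypothetical data: the "group" column is an assumption, not part of the question.
df = pd.DataFrame({
    "group": ["x", "x", "y", "y"],
    "a": ["1", "U", "3.4", "2.6"],
})

# Strings that cannot be parsed (here "U") become NaN; the column becomes float64.
df["a"] = pd.to_numeric(df["a"], errors="coerce")

# mean() skips NaN by default, so the coerced value does not distort the averages.
means = df.groupby("group")["a"].mean()
print(means)
```

If you need to know how many values were coerced, df["a"].isna().sum() counts the NaNs afterwards, though it also counts any values that were already missing before the conversion.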