df

          A        B
    0  a=10  b=20.10
    1  a=20      NaN
    2   NaN  b=30.10
    3  a=40  b=40.10
I tried:

    df['A'] = df['A'].str.extract('(\d+)').astype(int)
    df['B'] = df['B'].str.extract('(\d+)').astype(float)
But I get the following error:
ValueError: cannot convert float NaN to integer
And:
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
How do I fix this?
Pandas DataFrame astype() method: astype() returns a new DataFrame in which the data types have been changed to the specified type.
Change the data type of a Series in pandas: the astype() function is used to cast a pandas object to a specified data type. Use a numpy.dtype or Python type to cast the entire pandas object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type.
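To illustrate the {col: dtype} form described above, here is a minimal sketch. The sample values are made up for the illustration and contain no missing entries, so both casts go through without error.

```python
import pandas as pd

# Hypothetical sample data: numeric text with no missing values.
df = pd.DataFrame({"A": ["10", "20", "40"], "B": ["20.10", "30.10", "40.10"]})

# Cast each column to its own type with a {column label: dtype} mapping.
# astype() returns a new DataFrame; df keeps its original string columns.
converted = df.astype({"A": int, "B": float})
print(converted.dtypes)
```

Once a NaN is present in column A the int cast fails, which is the situation in the question.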
With the dropna() method you can drop rows that contain NaN (Not a Number) or None values from a pandas DataFrame. Note that by default it returns a copy of the DataFrame with those rows removed. If you want to remove them from the existing DataFrame instead, use inplace=True.
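A short sketch of dropna() on the frame from the question (rebuilt here by hand, so the construction itself is an assumption). It only shows that a new frame is returned and the original is left alone.

```python
import numpy as np
import pandas as pd

# The question's frame, reconstructed for the illustration.
df = pd.DataFrame({
    "A": ["a=10", "a=20", np.nan, "a=40"],
    "B": ["b=20.10", np.nan, "b=30.10", "b=40.10"],
})

# Rows with at least one missing value are dropped in the returned copy.
complete_rows = df.dropna()
print(complete_rows)

# The original frame still has all four rows.
print(len(df), len(complete_rows))
```

Note that this throws away rows 1 and 2 entirely, which may not be what you want if the other column still holds a usable value.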
If some values in a column are missing (NaN) and the column is then converted to numeric, the dtype is always float. You cannot convert the values to int, only to float, because the type of NaN is float.
    print (type(np.nan))
    <class 'float'>
See the docs for how values are converted if there is at least one NaN:

    integer > cast to float64
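The promotion rule can be checked directly. This is only a small sketch with made-up values; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

whole = pd.Series([10, 20, 40])             # no missing values
with_gap = pd.Series([10, 20, np.nan, 40])  # one missing value

print(whole.dtype)     # an integer dtype
print(with_gap.dtype)  # promoted to float64 because of the NaN

# Casting the column with the gap to int fails.
# The pandas error raised here is a ValueError (or a subclass of it).
raised = False
try:
    with_gap.astype(int)
except ValueError as exc:
    raised = True
    print(exc)
```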
If you need int values, you need to replace NaN with some int, e.g. 0, using fillna, and then it works perfectly:
    # raw strings avoid the invalid-escape warning for \d
    df['A'] = df['A'].str.extract(r'(\d+)', expand=False)
    df['B'] = df['B'].str.extract(r'(\d+)', expand=False)
    print (df)
         A    B
    0   10   20
    1   20  NaN
    2  NaN   30
    3   40   40

    df1 = df.fillna(0).astype(int)
    print (df1)
        A   B
    0  10  20
    1  20   0
    2   0  30
    3  40  40

    print (df1.dtypes)
    A    int32
    B    int32
    dtype: object
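For completeness, here is a self-contained sketch that starts from the question's data. It makes a few assumptions that are not in the answer above: the frame is rebuilt by hand, column B should keep its decimal part (so the pattern is widened to capture it), and a reasonably recent pandas is installed so the nullable "Int64" dtype is available as an alternative to filling with 0.

```python
import numpy as np
import pandas as pd

# The question's frame, reconstructed for the illustration.
df = pd.DataFrame({
    "A": ["a=10", "a=20", np.nan, "a=40"],
    "B": ["b=20.10", np.nan, "b=30.10", "b=40.10"],
})

# Pull the digits out of A; rows with no match (the NaN row) stay missing.
a_digits = pd.to_numeric(df["A"].str.extract(r"(\d+)", expand=False))

# Option 1: replace the gap with 0, then cast to a plain int dtype.
a_filled = a_digits.fillna(0).astype(int)

# Option 2 (assumption: nullable dtypes are acceptable): keep the gap as <NA>.
a_nullable = a_digits.astype("Int64")

# B: capture an optional decimal part as well, so 20.10 is not cut to 20.
b_float = pd.to_numeric(df["B"].str.extract(r"(\d+(?:\.\d+)?)", expand=False))

print(a_filled.tolist())
print(a_nullable)
print(b_float)
```

As for the AttributeError in the question: the .str accessor only works on string-typed columns, so one likely cause (an assumption about the asker's session) is that the conversion line was run a second time, after the column had already become numeric.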