
Error using astype when NaN exists in a DataFrame

Tags:

pandas

df
      A        B
0  a=10  b=20.10
1  a=20      NaN
2   NaN  b=30.10
3  a=40  b=40.10

I tried:

df['A'] = df['A'].str.extract('(\d+)').astype(int)
df['B'] = df['B'].str.extract('(\d+)').astype(float)

But I get the following error:

ValueError: cannot convert float NaN to integer

And:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

How do I fix this?

asked Jan 09 '17 14:01 by Sun




1 Answer

If some values in a column are missing (NaN) and the column is then converted to numeric, its dtype is always float. You cannot convert such a column to int, only to float, because NaN itself is a float.

print (type(np.nan))
<class 'float'>

See the docs for how a dtype is promoted when at least one NaN is present:

integer: cast to float64
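The promotion rule can be checked with a minimal sketch, assuming pandas and NumPy are installed. The Series names `ints` and `with_nan` are illustrative only and do not come from the question.

```python
import numpy as np
import pandas as pd

# A Series holding only integers keeps an integer dtype.
ints = pd.Series([10, 20, 40])

# A single NaN forces pandas to store the values as float64.
with_nan = pd.Series([10, 20, np.nan, 40])

print(ints.dtype)
print(with_nan.dtype)  # float64

# Casting the NaN-containing Series to int fails with a ValueError.
cast_failed = False
try:
    with_nan.astype(int)
except ValueError as exc:
    cast_failed = True
    print(type(exc).__name__, exc)
```

This is the same failure the question reports: the cast is refused because NaN has no integer representation.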

If you need int values, replace NaN with some int first, e.g. 0 with fillna, and then the cast works:

df['A'] = df['A'].str.extract(r'(\d+)', expand=False)
df['B'] = df['B'].str.extract(r'(\d+)', expand=False)
print (df)
     A    B
0   10   20
1   20  NaN
2  NaN   30
3   40   40

df1 = df.fillna(0).astype(int)
print (df1)
    A   B
0  10  20
1  20   0
2   0  30
3  40  40

print (df1.dtypes)
A    int32
B    int32
dtype: object
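For a copy-and-paste version, here is a self-contained sketch that rebuilds the question's data and applies the fillna approach. Two things here are assumptions rather than part of the original answer: the extracted strings go through `pd.to_numeric` before `fillna`, and the last step shows pandas' nullable `"Int64"` dtype as an alternative that keeps the gaps as `<NA>` instead of replacing them with 0. The names `extracted`, `numeric`, `filled` and `nullable` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "A": ["a=10", "a=20", np.nan, "a=40"],
    "B": ["b=20.10", np.nan, "b=30.10", "b=40.10"],
})

# Pull the first run of digits out of each string; missing rows stay missing.
extracted = df.apply(lambda col: col.str.extract(r"(\d+)", expand=False))

# Convert the digit strings to numbers (missing values are preserved).
numeric = extracted.apply(pd.to_numeric)

# Approach from the answer: replace missing values with 0, then cast to int.
filled = numeric.fillna(0).astype(int)
print(filled)

# Alternative (an assumption, not in the original answer): the nullable
# integer dtype keeps missing values as <NA> while the rest stay integers.
nullable = numeric.astype("Int64")
print(nullable)
```

Filling with 0 is only safe when 0 cannot be confused with a real value in the data; otherwise the nullable dtype may be the better fit.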
answered Sep 17 '22 15:09 by jezrael