I get ValueError: cannot convert float NaN to integer for the following:
df = pandas.read_csv('zoom11.csv')
df[['x']] = df[['x']].astype(int)
Update: Using the hints in comments/answers I got my data clean with this:
# x contained NaN
df = df[~df['x'].isnull()]
# Y contained some other garbage, so null check was not enough
df = df[df['y'].str.isnumeric()]
# final conversion now worked
df[['x']] = df[['x']].astype(int)
df[['y']] = df[['y']].astype(int)
We can replace NaN values with 0 using the fillna() method. It finds the NaN values in the DataFrame columns and fills them with the given value.
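As a minimal sketch of the fillna() approach, assuming a small made-up DataFrame in place of the question's zoom11.csv (the sample values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the 'x' column from the question
df = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

# Replace missing values with 0, then convert the column to integers
df["x"] = df["x"].fillna(0).astype(int)

print(df["x"].tolist())
```

Note that this keeps every row but turns missing values into real zeros, which may or may not be what the data should say.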
Use the dropna() method to drop rows with NaN / None values from a pandas DataFrame. Python has no null keyword, so missing data is represented as None or NaN. NaN stands for Not a Number and is one of the common ways to represent a missing value in data.
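A short sketch of dropna(), again on made-up data (the column contents are assumptions for illustration). With no arguments it drops rows that have a missing value in any column, while subset= restricts the check to the named columns:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one NaN in 'x', one None in 'y'
df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", "b", None]})

# Drop rows with a missing value in any column
cleaned = df.dropna()

# Drop rows only where 'x' is missing; the None in 'y' is left alone
only_x = df.dropna(subset=["x"])

print(len(cleaned), len(only_x))
```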
Python also has a built-in function to convert floats to integers: int(). For example, int(390.8) returns 390. When converting a float, int() discards the fractional part: it truncates toward zero rather than rounding.
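A quick illustration of the truncation behaviour, with round() shown for contrast (the variable names are only for this example):

```python
# int() drops the fractional part, truncating toward zero
positive = int(390.8)
negative = int(-390.8)

# round() goes to the nearest integer instead
rounded = round(390.8)

print(positive, negative, rounded)  # 390 -390 391
```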
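In Python, the float type has a NaN value (float('nan')). It has no integer equivalent, so passing it to int() raises the ValueError shown above.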
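The following sketch reproduces the error from the question in plain Python, without pandas. It relies only on the standard library:

```python
import math

nan = float("nan")

# NaN is a float value that compares unequal to everything, itself included
is_nan = math.isnan(nan)
equals_itself = nan == nan

# Converting it to an integer fails with the same ValueError as in the question
try:
    int(nan)
except ValueError as exc:
    message = str(exc)

print(is_nan, equals_itself, message)
```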
To identify NaN values, use boolean indexing:
print(df[df['x'].isnull()])
Then, to get rid of all non-numeric values, use to_numeric with the parameter errors='coerce', which replaces non-numeric values with NaN:
df['x'] = pd.to_numeric(df['x'], errors='coerce')
To remove all rows with NaN in column x, use dropna:
df = df.dropna(subset=['x'])
Finally, convert the values to int:
df['x'] = df['x'].astype(int)
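Put together, the steps above can be sketched as one small helper. The function name to_int_column and the sample data are hypothetical, chosen only for this illustration:

```python
import pandas as pd


def to_int_column(df, column):
    """Return a copy of df with `column` as integers, dropping rows that cannot be converted."""
    out = df.copy()
    # Non-numeric entries (and missing ones) become NaN
    out[column] = pd.to_numeric(out[column], errors="coerce")
    # Remove the rows where the column is NaN, then convert what is left
    return out.dropna(subset=[column]).astype({column: int})


# Hypothetical input mixing numeric strings, garbage, and a missing value
df = pd.DataFrame({"x": ["1", "2", "oops", None, "5"]})
clean = to_int_column(df, "x")

print(clean["x"].tolist())
```

Working on a copy leaves the original DataFrame untouched, so the rows that were dropped can still be inspected afterwards.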