I get the following error while trying to convert an object (string) column in pandas to Int32, which is an integer type that allows NA values.
df.column = df.column.astype('Int32')
TypeError: object cannot be converted to an IntegerDtype
I'm using pandas version: 0.25.3
As of v0.24, you can use: df['col'] = df['col'].astype(pd.Int32Dtype())
Edit: I should have mentioned that this falls under the Nullable integer documentation. The docs specify other nullable integer types as well (e.g. Int64Dtype, Int8Dtype, UInt64Dtype).
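A minimal sketch of the nullable-integer cast, assuming the column is already numeric (the frame and the column name `col` are made up for illustration). The string alias 'Int32' and pd.Int32Dtype() refer to the same dtype:

```python
import pandas as pd

# Hypothetical numeric column with one missing value (stored as float64 with NaN).
df = pd.DataFrame({"col": [1.0, None, 3.0]})

# Cast to the nullable integer dtype; the missing entry stays missing.
df["col"] = df["col"].astype(pd.Int32Dtype())

# The string alias names the same dtype.
same = df["col"].astype("Int32")
print(df["col"].dtype, same.dtype)
```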
It's a known bug, as explained here.
The workaround is to convert the column to float first and then to Int32.
Make sure you strip whitespace from your column before you do the conversion:
df.column = df.column.str.strip()
Then do the conversion:
df.column = df.column.astype('float') # first convert to float before int
df.column = df.column.astype('Int32')
or simpler:
df.column = df.column.astype('float').astype('Int32') # or Int64
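Put together, the workaround might look like the sketch below. The sample data is invented for illustration; it assumes every non-missing value in the column parses as a whole number.

```python
import numpy as np
import pandas as pd

# Hypothetical object (string) column with stray whitespace and a missing value.
df = pd.DataFrame({"column": ["1", " 2 ", np.nan, "4"]})

# Strip surrounding whitespace; missing values are left as NaN.
df["column"] = df["column"].str.strip()

# Go through float first, then to the nullable integer dtype.
df["column"] = df["column"].astype("float").astype("Int32")

print(df["column"].dtype)
print(df["column"].isna().sum())
```

If the column contained a non-integer value such as "1.5", the final cast to Int32 would be expected to raise rather than silently truncate, so it is worth checking the data first.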