I have a dataframe with a column of dtype('int64'). The values in the column range from 0 to 10. The dataframe has 770K rows and 56 columns of different types. When I run the code below, I still get dtype('int64'). I expected the result to be downcast to at least int32 or int16. Here's a reproducible example.
import pandas as pd
df = pd.DataFrame([x for x in range(10)]*77000, columns=['recommendation'])
df.dtypes
df.recommendation.apply(lambda x: pd.to_numeric(x, downcast='integer')).dtypes
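The apply method works cell by cell, so to_numeric only ever sees one value at a time and cannot figure out that the whole column can be downcast. Whatever each call returns, apply then reassembles the results into a new Series with the default int64 dtype. You need to call to_numeric on the whole column, as Ben indicated in a comment: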
pd.to_numeric(df.recommendation, downcast='integer')
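As a minimal sketch of the difference, the snippet below runs both approaches side by side. It uses a smaller frame than the one in the question so it finishes quickly, and the names per_cell and whole are only illustrative. Note that to_numeric returns a new Series, so the result has to be assigned back for the dataframe to change.

```python
import pandas as pd

# Smaller stand-in for the frame in the question (1,000 rows instead of 770K).
df = pd.DataFrame(list(range(10)) * 100, columns=["recommendation"])

# Cell-by-cell: each call sees a single value, and apply rebuilds the Series afterwards.
per_cell = df["recommendation"].apply(lambda x: pd.to_numeric(x, downcast="integer"))

# Whole column: to_numeric can inspect every value and pick the smallest integer dtype.
whole = pd.to_numeric(df["recommendation"], downcast="integer")

print(per_cell.dtype)  # the question reports int64 here
print(whole.dtype)     # int8, since every value fits in 8 bits

# Compare the raw storage before and after downcasting.
print(df["recommendation"].nbytes, whole.nbytes)

# Assign the result back to change the column in the dataframe itself.
df["recommendation"] = whole
```

With values between 0 and 9 the column goes all the way down to int8 rather than int16 or int32, because downcast='integer' picks the smallest signed integer type that can hold the data.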