I have a data frame like this:
import pandas as pd
test_df = pd.DataFrame({'foo':['1','2','92#']})
test_df
foo
0 1
1 2
2 92#
I want to convert the type to int64:
test_df.foo.astype('int64')
but I get an error message because '92#' can't be converted to int64:
ValueError: invalid literal for int() with base 10: '92#'
So I want to drop all rows that can't be converted to int64, and get a result like this:
foo
0 1
1 2
If you want a solution that applies to the DataFrame as a whole, call pd.to_numeric through apply, and use the resulting mask to drop rows:
test_df[test_df.apply(pd.to_numeric, errors='coerce').notna()].dropna()
foo
0 1
1 2
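To make the masking step easier to follow, here is a minimal sketch that splits it into named parts. The variable names converted, mask and filtered are only illustrative, and the explicit .all(axis=1) is an assumption about how you would want to treat frames with more than one column (a row is kept only if every column parses).

```python
import pandas as pd

test_df = pd.DataFrame({'foo': ['1', '2', '92#']})

# Strings that cannot be parsed as numbers become NaN
converted = test_df.apply(pd.to_numeric, errors='coerce')

# One boolean per row: True only if every column parsed successfully
mask = converted.notna().all(axis=1)

# Select the surviving rows from the original frame; the values stay strings
filtered = test_df[mask]
print(filtered)
```

Because the mask is only used for selection, test_df itself is left untouched and filtered still holds the original string values.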
This does not modify test_df's values: the rows that remain still hold the original strings. On the other hand, if you want to drop rows while converting values, the solution simplifies:
test_df.apply(pd.to_numeric, errors='coerce').dropna()
foo
0 1.0
1 2.0
Add an .astype(int) call at the end if you want the result typecast to int64.
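Putting the conversion, the row drop and the final cast together might look like the sketch below. It uses 'int64' rather than int as the target, on the assumption that you want the width to be explicit; depending on platform and NumPy version, plain int may map to a different integer size.

```python
import pandas as pd

test_df = pd.DataFrame({'foo': ['1', '2', '92#']})

result = (
    test_df.apply(pd.to_numeric, errors='coerce')  # '92#' becomes NaN, so the column is float
    .dropna()                                      # drop rows that failed to parse
    .astype('int64')                               # safe now that no NaN remains
)
print(result)
print(result.dtypes)
```

The cast has to come after dropna(), since a column that still contains NaN cannot be converted to a plain integer dtype.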