Exception handling when changing Pandas dataframe type

Tags: python, pandas

I have a Pandas dataframe with a single column of strings. I want to convert the column data to float. Some of the values cannot be converted to float due to their format. I want to omit these "illegal strings" from the result and only extract values that can be legally re-cast as floats. The starting data:

import pandas as pd

test=pd.DataFrame()
test.loc[0,'Value']='<3'
test.loc[1,'Value']='10'
test.loc[2,'Value']='Detected'
test.loc[3,'Value']=''

The desired output contains only strings that could be re-cast as floats (in this case, 10):

cleanDF=test['Value'].astype(float)
cleanDF
0    10
Name: Value, dtype: float64

As expected, astype(float) raises an error on the first string that cannot be converted to float:

ValueError: could not convert string to float: <3

Is there a simple way to solve this if the dataframe is large and contains many illegal strings in 'Value'?

Thanks.

lmart999 asked May 12 '14 06:05



1 Answer

You could try using the Series apply method. Write a function that wraps the conversion in an exception handler, apply it to the column, and drop the values that failed to convert.

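def test_apply(x):
    # Return the float value, or None when the string cannot be parsed.
    try:
        return float(x)
    except ValueError:
        return None

# Unparseable entries come back as missing values, which dropna() removes.
cleanDF = test['Value'].apply(test_apply).dropna()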
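
For a large column, a vectorized alternative may be worth trying. The sketch below swaps the apply-based handler for pandas' own pd.to_numeric with errors='coerce', which is not part of the approach above and assumes a pandas version that ships that function. The variable names as_float and clean are just illustrative.

```python
import pandas as pd

# Same sample data as in the question, built in a single call.
test = pd.DataFrame({'Value': ['<3', '10', 'Detected', '']})

# errors='coerce' turns every unparseable string into NaN instead of raising.
as_float = pd.to_numeric(test['Value'], errors='coerce')

# Drop the NaN rows to keep only the values that converted cleanly.
clean = as_float.dropna()
print(clean)
```

Like the apply version, this keeps the original index label of the surviving row (1 here, not 0); chaining .reset_index(drop=True) would renumber from zero if that is what you want.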
Phil answered Oct 21 '22 22:10
