
Exception Handling in Pandas .apply() function

If I have a DataFrame:

myDF = DataFrame(data=[[11,11],[22,'2A'],[33,33]], columns = ['A','B']) 

This gives the following DataFrame (I'm just starting out on Stack Overflow and don't have enough reputation to post an image of it):

    | A  | B  |
 0  | 11 | 11 |
 1  | 22 | 2A |
 2  | 33 | 33 |

If I want to convert column B to int values and drop the values that can't be converted, I have to do:

def convertToInt(cell):
    try:
        return int(cell)
    except:
        return None

myDF['B'] = myDF['B'].apply(convertToInt)

If I only do:

myDF['B'].apply(int)

the error obviously is:

C:\WinPython-32bit-2.7.5.3\python-2.7.5\lib\site-packages\pandas\lib.pyd in pandas.lib.map_infer (pandas\lib.c:42840)()

ValueError: invalid literal for int() with base 10: '2A'

Is there a way to add exception handling to myDF['B'].apply() itself?

Thank you in advance!

RukTech asked Apr 03 '14 19:04




1 Answer

I had the same question, but for a more general case where it was hard to tell if the function would generate an exception (i.e. you couldn't explicitly check this condition with something as straightforward as isdigit).

After thinking about it for a while, I came up with the solution of embedding the try/except syntax in a separate function. I'm posting a toy example in case it helps anyone.

import pandas as pd
import numpy as np

x = pd.DataFrame(np.array([['a','a'], [1,2]]))

def augment(x):
    try:
        return int(x)+1
    except:
        return 'error:' + str(x)

x[0].apply(lambda x: augment(x))
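One possible generalization of the toy example above, offered as a sketch rather than as the way to do it: the try/except can live in a small reusable wrapper, so any function can be handed to .apply() without rewriting it each time. The helper name catching and its default and exceptions parameters are hypothetical, not part of pandas. The sketch assumes that ValueError and TypeError are the only failures worth swallowing.

```python
import pandas as pd


def catching(func, default=None, exceptions=(ValueError, TypeError)):
    """Wrap func so the listed exceptions yield `default` instead of propagating.

    `catching` is a hypothetical helper name, not a pandas API.
    """
    def wrapper(value):
        try:
            return func(value)
        except exceptions:
            return default
    return wrapper


# The DataFrame from the question.
myDF = pd.DataFrame(data=[[11, 11], [22, '2A'], [33, 33]], columns=['A', 'B'])

# Cells that int() cannot convert come back as missing values instead of raising.
converted = myDF['B'].apply(catching(int))

# Drop the rows that failed, then cast the surviving values back to int.
cleaned = myDF.assign(B=converted).dropna(subset=['B']).astype({'B': int})
print(cleaned)
```

Catching only the named exception types, rather than using a bare except, keeps unrelated failures such as KeyboardInterrupt visible. For the purely numeric case in the question, pd.to_numeric(myDF['B'], errors='coerce') is a built-in way to turn unparseable values into NaN, though it also accepts strings like '1.5' that int() would reject.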
atkat12 answered Sep 19 '22 17:09