Pandas - 'Series' object has no attribute 'colNames' when using apply()

Tags: python, pandas

I need to use a lambda function to do a row-by-row computation. For example, create a DataFrame:

import pandas as pd
import numpy as np

def myfunc(x, y):
    return x + y

colNames = ['A', 'B']
data = np.array([np.arange(10)]*2).T

df = pd.DataFrame(data, index=range(0, 10), columns=colNames)

Using 'myfunc' with attribute access works:

df['D'] = (df.apply(lambda x: myfunc(x.A, x.B), axis=1))

But this second case, which goes through the colNames list, does not work:

df['D'] = (df.apply(lambda x: myfunc(x.colNames[0], x.colNames[1]), axis=1))

It gives the error:

AttributeError: ("'Series' object has no attribute 'colNames'", u'occurred at index 0')

I really need the second form, accessing the columns through the colNames list, but it raises the error above. Any clues on how to do this?

Runner Bean asked Nov 09 '16 11:11


1 Answer

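When you use df.apply() with axis=1, each row of your DataFrame is passed to your lambda function as a pandas Series. The frame's columns become the index of that Series, so you can access values with series[label]. Writing x.colNames makes pandas look for an attribute or label literally named 'colNames' on the row, which does not exist; the list has to be used as the lookup key instead.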

So this should work:

df['D'] = (df.apply(lambda x: myfunc(x[colNames[0]], x[colNames[1]]), axis=1)) 
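For reference, here is a self-contained sketch of the same idea using the question's own setup. The names first_row, row_labels and d_vectorized are only illustrative, and the vectorized line at the end is an extra suggestion that assumes myfunc works on whole columns (it does for simple arithmetic like x + y, but may not for arbitrary functions).

```python
import numpy as np
import pandas as pd

def myfunc(x, y):
    return x + y

colNames = ['A', 'B']
data = np.array([np.arange(10)] * 2).T
df = pd.DataFrame(data, index=range(0, 10), columns=colNames)

# Each row handed to the lambda is a Series whose index holds the column labels.
first_row = df.iloc[0]
row_labels = list(first_row.index)

# Look the values up by label, with the label taken from the colNames list.
df['D'] = df.apply(lambda row: myfunc(row[colNames[0]], row[colNames[1]]), axis=1)

# Illustrative alternative (an assumption, not from the answer above):
# pass whole columns when myfunc supports element-wise operation.
d_vectorized = myfunc(df[colNames[0]], df[colNames[1]])
```

Bracket lookup works with any label stored in a variable, which is why it fits the case where the column names come from a list.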
foglerit answered Oct 17 '22 08:10