Error when trying to apply log method to pandas data frame column in Python

I am very new to Python and pandas (and programming in general), and I am having trouble with a seemingly simple function. I created the following DataFrame from data pulled with a SQL query (if you need to see the query, let me know and I'll paste it):

spydata = pd.DataFrame(row,columns=['date','ticker','close', 'iv1m', 'iv3m'])
tickerlist = unique(spydata[spydata['date'] == '2013-05-31'])

After that, I wrote a function that creates some new columns in the DataFrame from the data it already holds:

def demean(arr):
    arr['retlog'] = log(arr['close']/arr['close'].shift(1))

    arr['10dvol'] = sqrt(252)*sqrt(pd.rolling_std(arr['ret'] , 10 ))  
    arr['60dvol'] = sqrt(252)*sqrt(pd.rolling_std(arr['ret'] , 10 ))  
    arr['90dvol'] = sqrt(252)*sqrt(pd.rolling_std(arr['ret'] , 10 ))  
    arr['1060rat'] = arr['10dvol']/arr['60dvol']
    arr['1090rat'] = arr['10dvol']/arr['90dvol']
    arr['60dis'] = (arr['1060rat'] - arr['1060rat'].mean())/arr['1060rat'].std()
    arr['90dis'] = (arr['1090rat'] - arr['1090rat'].mean())/arr['1090rat'].std()
    return arr

The only part I'm having a problem with is the first line of the function:

arr['retlog'] = log(arr['close']/arr['close'].shift(1))

When I run it with this command, I get an error:

result = spydata.groupby(['ticker']).apply(demean)

Error:

    ---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-196-4a66225e12ea> in <module>()
----> 1 result = spydata.groupby(['ticker']).apply(demean)
      2 results2 = result[result.date == result.date.max()]
      3 

C:\Python27\lib\site-packages\pandas-0.11.0-py2.7-win32.egg\pandas\core\groupby.pyc in apply(self, func, *args, **kwargs)
    323         func = _intercept_function(func)
    324         f = lambda g: func(g, *args, **kwargs)
--> 325         return self._python_apply_general(f)
    326 
    327     def _python_apply_general(self, f):

C:\Python27\lib\site-packages\pandas-0.11.0-py2.7-win32.egg\pandas\core\groupby.pyc in _python_apply_general(self, f)
    326 
    327     def _python_apply_general(self, f):
--> 328         keys, values, mutated = self.grouper.apply(f, self.obj, self.axis)
    329 
    330         return self._wrap_applied_output(keys, values,

C:\Python27\lib\site-packages\pandas-0.11.0-py2.7-win32.egg\pandas\core\groupby.pyc in apply(self, f, data, axis, keep_internal)
    632             # group might be modified
    633             group_axes = _get_axes(group)
--> 634             res = f(group)
    635             if not _is_indexed_like(res, group_axes):
    636                 mutated = True

C:\Python27\lib\site-packages\pandas-0.11.0-py2.7-win32.egg\pandas\core\groupby.pyc in <lambda>(g)
    322         """
    323         func = _intercept_function(func)
--> 324         f = lambda g: func(g, *args, **kwargs)
    325         return self._python_apply_general(f)
    326 

<ipython-input-195-47b6faa3f43c> in demean(arr)
      1 def demean(arr):
----> 2     arr['retlog'] = log(arr['close']/arr['close'].shift(1))
      3     arr['10dvol'] = sqrt(252)*sqrt(pd.rolling_std(arr['ret'] , 10 ))
      4     arr['60dvol'] = sqrt(252)*sqrt(pd.rolling_std(arr['ret'] , 10 ))
      5     arr['90dvol'] = sqrt(252)*sqrt(pd.rolling_std(arr['ret'] , 10 ))

AttributeError: log

I have also tried changing the function to np.log and to math.log, in which case I get the error:

TypeError: only length-1 arrays can be converted to Python scalars

I've tried looking this up, but haven't found anything directly applicable. Any clues?

asked Jun 06 '13 17:06

user2460677

1 Answer

This happens when the column's dtype is not numeric. Try casting it to float first:

arr['retlog'] = log(arr['close'].astype('float64')/arr['close'].astype('float64').shift(1))

I suspect the numbers are stored with the generic 'object' dtype, which causes log to raise that error. Here is a simple illustration of the problem:

In [15]: np.log(Series([1,2,3,4], dtype='object'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-15-25deca6462b7> in <module>()
----> 1 np.log(Series([1,2,3,4], dtype='object'))

AttributeError: log

In [16]: np.log(Series([1,2,3,4], dtype='float64'))
Out[16]: 
0    0.000000
1    0.693147
2    1.098612
3    1.386294
dtype: float64
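
To check whether this is what is happening with your own data, inspect the column's dtype before taking the log. The sketch below is only an illustration under assumptions: the prices are made up, and I am guessing that the SQL driver returned decimal.Decimal objects, which pandas stores as 'object'. On recent NumPy versions the failure surfaces as a TypeError rather than a bare AttributeError, so the example catches both.

```python
from decimal import Decimal

import numpy as np
import pandas as pd

# Hypothetical prices, stored the way a SQL driver might return them (assumption).
spydata = pd.DataFrame({
    "ticker": ["SPY", "SPY", "SPY"],
    "close": [Decimal("100.0"), Decimal("101.5"), Decimal("99.8")],
})
print(spydata["close"].dtype)  # object

# np.log cannot operate on generic Python objects.
try:
    np.log(spydata["close"])
    log_failed = False
except (AttributeError, TypeError) as exc:
    log_failed = True
    print("np.log failed:", exc)

# Cast once to a numeric dtype, then compute the log return.
spydata["close"] = spydata["close"].astype("float64")
retlog = np.log(spydata["close"] / spydata["close"].shift(1))
print(retlog)
```

Casting the column once, right after building the DataFrame, means every later calculation sees floats and the cast does not have to be repeated inside the function.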

Your attempt with math.log did not work because that function is designed for single numbers (scalars) only, not lists or arrays. np.log, by contrast, works elementwise on arrays and Series.
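
As a small sketch of that difference, using a made-up Series named prices (not your data): math.log accepts one element at a time and raises a TypeError when handed the whole Series, while np.log returns a Series of the same length.

```python
import math

import numpy as np
import pandas as pd

prices = pd.Series([1.0, 2.0, 4.0])  # hypothetical values

# math.log works on a single scalar.
scalar_result = math.log(prices.iloc[1])

# Passing the whole Series fails: it cannot be converted to one float.
math_error = None
try:
    math.log(prices)
except TypeError as exc:
    math_error = exc
    print("math.log failed:", exc)

# np.log is vectorized and applies the log to every element.
vector_result = np.log(prices)
print(vector_result)
```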

For what it's worth, I think this is a confusing error message; it once stumped me for a while, anyway. I wonder if it can be improved.

answered Oct 20 '22 06:10

Dan Allan

Tags: python, pandas, dataframe, numpy