Diff on pandas dataframe with more than one column

Tags: python, pandas

I have a pandas dataframe with two columns:

ddf.head()

    a    b
0   3136 13280
1   3072 13312
2   3152 13296
3   3120 13248
4   3120 13200

I would like to calculate the difference between consecutive elements within each column. If I do it one column at a time (ddf['a'].diff()), it works as I expect, but calling ddf.diff() on the whole dataframe raises:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-68-6ff864856571> in <module>()
----> 1 ddf.diff()

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in diff(self, periods)
   4285         diffed : DataFrame
   4286         """
-> 4287         new_data = self._data.diff(periods)
   4288         return self._constructor(new_data)
   4289 

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in diff(self, *args, **kwargs)
   1287 
   1288     def diff(self, *args, **kwargs):
-> 1289         return self.apply('diff', *args, **kwargs)
   1290 
   1291     def interpolate(self, *args, **kwargs):

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in apply(self, f, *args, **kwargs)
   1267                 applied = f(blk, *args, **kwargs)
   1268             else:
-> 1269                 applied = getattr(blk,f)(*args, **kwargs)
   1270 
   1271             if isinstance(applied,list):

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in diff(self, n)
    423     def diff(self, n):
    424         """ return block for the diff of the values """
--> 425         new_values = com.diff(self.values, n, axis=1)
    426         return make_block(new_values, self.items, self.ref_items, fastpath=True)
    427 

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/common.pyc in diff(arr, n, axis)
    643     if arr.ndim == 2 and arr.dtype.name in _diff_special:
    644         f = _diff_special[arr.dtype.name]
--> 645         f(arr, out_arr, n, axis)
    646     else:
    647         res_indexer = [slice(None)] * arr.ndim

/home/app/anaconda/lib/python2.7/site-packages/pandas/algos.so in pandas.algos.diff_2d_int16 (pandas/algos.c:91446)()

ValueError: Buffer dtype mismatch, expected 'float32_t' but got 'double'

asked Nov 12 '13 21:11

Fra

1 Answer

You can subtract the shifted dataframe from the original:

>>> df - df.shift(1)
    a   b
0 NaN NaN
1 -64  32
2  80 -16
3 -32 -48
4   0 -48

That said, on my machine df.diff() works fine:

>>> df.diff()
    a   b
0 NaN NaN
1 -64  32
2  80 -16
3 -32 -48
4   0 -48
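
For reference, here is a minimal sketch that reproduces the shift-based approach on the sample data from the question. The int16 dtype is an assumption inferred from the diff_2d_int16 frame in the traceback, and the names via_shift and via_cast are just illustrative. Casting to float64 before calling diff() is a possible second workaround, not something confirmed above.

```python
import pandas as pd

# Sample data from the question. The int16 dtype is an assumption based on
# the "diff_2d_int16" frame in the traceback, not something the asker stated.
df = pd.DataFrame(
    {
        "a": [3136, 3072, 3152, 3120, 3120],
        "b": [13280, 13312, 13296, 13248, 13200],
    },
    dtype="int16",
)

# Approach from the answer: subtract the frame shifted down by one row.
# The first row becomes NaN because it has no predecessor.
via_shift = df - df.shift(1)

# Possible workaround (assumption): upcast to float64 so that diff()
# does not take the small-integer code path.
via_cast = df.astype("float64").diff()

print(via_shift)
print(via_cast)
```

If df.diff() already works in your pandas version, as it did for me, neither workaround is needed.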

answered Sep 30 '22 11:09

Roman Pekar
