pandas equivalent to numpy.roll

Tags: python, pandas, numpy

I have a pandas dataframe and I'd like to add a new column that has the contents of an existing column, but shifted relative to the rest of the data frame. I'd also like the value that drops off the bottom to get rolled around to the top.

For example if this is my dataframe:

>>> myDF
   coord  coverage
0      1         1
1      2        10
2      3        50

I want to get this:

>>> myDF_shifted
   coord  coverage  coverage_shifted
0      1         1                50
1      2        10                 1
2      3        50                10

(This is just a simplified example - in real life, my dataframes are larger and I will need to shift by more than one unit)

This is what I've tried and what I get back:

>>> myDF['coverage_shifted'] = myDF.coverage.shift(1)
>>> myDF
   coord  coverage  coverage_shifted
0      1         1               NaN
1      2        10                 1
2      3        50                10

So I can create the shifted column, but I don't know how to roll the bottom value around to the top. From internet searches I think that numpy lets you do this with "numpy.roll". Is there a pandas equivalent?

asked Aug 27 '15 18:08

Scarlet

1 Answer

Pandas probably doesn't provide an off-the-shelf method that does exactly what you describe. However, if you are willing to step a little outside pandas, numpy has exactly that: numpy.roll.

In your case it is:

import numpy as np
# Roll by 1 to match the example in the question; pass a larger count to shift further.
myDF['coverage_shifted'] = np.roll(myDF.coverage, 1)
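As a fuller sketch of the np.roll approach, the snippet below rebuilds the example frame from the question and rolls the coverage column by one row. The helper name roll_column is hypothetical (it is not a pandas or numpy function); it is only there to show one way to get the rolled values back as a Series that keeps the original index.

```python
import numpy as np
import pandas as pd

# Rebuild the small example frame from the question.
myDF = pd.DataFrame({"coord": [1, 2, 3], "coverage": [1, 10, 50]})

# np.roll returns a plain NumPy array, so the assignment is positional:
# the last value wraps around to the first row instead of becoming NaN.
myDF["coverage_shifted"] = np.roll(myDF["coverage"].to_numpy(), 1)


def roll_column(series, n):
    """Hypothetical helper: circularly shift a Series by n rows, keeping its index."""
    return pd.Series(np.roll(series.to_numpy(), n), index=series.index, name=series.name)


print(myDF)
print(roll_column(myDF["coverage"], 2))
```

A positive count moves values toward the bottom and wraps the overflow to the top; a negative count rolls in the other direction.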

answered Sep 23 '22 03:09

CT Zhu
