get previous row's value and calculate new column pandas python

Tags: python, pandas

Is there a way to look back to a previous row and calculate a new variable? As long as the previous row belongs to the same case, I want to compute (previous change) - (current change) and attribute it to the previous row's 'ChangeEvent' in new columns.

Here is my DataFrame:

    >>> df
      ChangeEvent StartEvent  case              change        open
    0    Homeless   Homeless     1 2014-03-08 00:00:00  2014-02-08
    1       other   Homeless     1 2014-04-08 00:00:00  2014-02-08
    2    Homeless   Homeless     1 2014-05-08 00:00:00  2014-02-08
    3        Jail   Homeless     1 2014-06-08 00:00:00  2014-02-08
    4        Jail       Jail     2 2014-06-08 00:00:00  2014-02-08

I want to add columns like this:

    Jail  Homeless  case
    0     6         1
    0     30        1
    0     0         1

... and so on.

Here is the code that builds the df:

    import pandas as pd
    import datetime as DT

    d = {'case' : pd.Series([1,1,1,1,2]),
         'open' : pd.Series([DT.datetime(2014, 3, 2), DT.datetime(2014, 3, 2),DT.datetime(2014, 3, 2),DT.datetime(2014, 3, 2),DT.datetime(2014, 3, 2)]),
         'change' : pd.Series([DT.datetime(2014, 3, 8), DT.datetime(2014, 4, 8),DT.datetime(2014, 5, 8),DT.datetime(2014, 6, 8),DT.datetime(2014, 6, 8)]),
         'StartEvent' : pd.Series(['Homeless','Homeless','Homeless','Homeless','Jail']),
         'ChangeEvent' : pd.Series(['Homeless','irrelivant','Homeless','Jail','Jail']),
         'close' : pd.Series([DT.datetime(2015, 3, 2), DT.datetime(2015, 3, 2),DT.datetime(2015, 3, 2),DT.datetime(2015, 3, 2),DT.datetime(2015, 3, 2)])}
    df=pd.DataFrame(d)


asked Feb 27 '14 22:02

Chet Meinzer

1 Answer

The way to get the previous row's value is to use the shift method:

    In [11]: df.change.shift(1)
    Out[11]:
    0          NaT
    1   2014-03-08
    2   2014-04-08
    3   2014-05-08
    4   2014-06-08
    Name: change, dtype: datetime64[ns]

Now you can subtract the original column from the shifted one. Note: this is with pandas 0.13.1 (datetime handling has had a lot of work recently, so YMMV with older versions).

    In [12]: df.change.shift(1) - df.change
    Out[12]:
    0        NaT
    1   -31 days
    2   -30 days
    3   -31 days
    4     0 days
    Name: change, dtype: timedelta64[ns]
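For anyone who wants to reproduce the shift-and-subtract step in isolation, here is a minimal sketch. It rebuilds only the `change` column from the question; the names `previous`, `delta`, and `delta_days` are illustrative, and the conversion to whole days via `.dt.days` is an extra step not shown in the session above.

```python
import pandas as pd

# Only the 'change' column from the question's DataFrame.
change = pd.Series(
    pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-06-08"]
    ),
    name="change",
)

# shift(1) moves every value down one row, so row i holds row i-1's value.
# The first row has no predecessor and becomes NaT.
previous = change.shift(1)

# (previous change) - (current change), a timedelta64 Series.
delta = previous - change

# Whole days as numbers; the leading NaT becomes NaN.
delta_days = delta.dt.days

print(delta)
print(delta_days)
```

Note that this version ignores `case`, so the last row is compared against a row from a different case; that is why the per-group step is still needed.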

You can just apply this to each case/group:

    In [13]: df.groupby('case')['change'].apply(lambda x: x.shift(1) - x)
    Out[13]:
    0        NaT
    1   -31 days
    2   -30 days
    3   -31 days
    4        NaT
    dtype: timedelta64[ns]
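The per-group step can probably also be written with `groupby(...).shift()` directly, and the result can then be spread into one column per previous `ChangeEvent`. The sketch below is only one possible reading of the "attribute it to the previous 'ChangeEvent'" part of the question: it assumes the goal is the total number of days between consecutive changes, credited to the event on the earlier row. The column names `prev_minus_current`, `prev_event`, and `days_in_prev_event` are made up for illustration, and the sign is flipped to get positive day counts.

```python
import pandas as pd

# Trimmed-down version of the question's DataFrame (only the columns used here).
df = pd.DataFrame({
    "case": [1, 1, 1, 1, 2],
    "change": pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-06-08"]
    ),
    "ChangeEvent": ["Homeless", "irrelivant", "Homeless", "Jail", "Jail"],
})

grouped = df.groupby("case")

# Previous row's values within the same case; the first row of each case
# has no predecessor and gets NaT / NaN.
prev_change = grouped["change"].shift(1)
prev_event = grouped["ChangeEvent"].shift(1)

# Same quantity as the apply/lambda version: (previous change) - (current change).
df["prev_minus_current"] = prev_change - df["change"]

# Assumed attribution step: positive day counts, labelled with the previous event.
df["prev_event"] = prev_event
df["days_in_prev_event"] = (df["change"] - prev_change).dt.days

# One column per previous event, one row per case, summing the day counts.
wide = df.dropna(subset=["prev_event"]).pivot_table(
    index="case",
    columns="prev_event",
    values="days_in_prev_event",
    aggfunc="sum",
    fill_value=0,
)

print(df)
print(wide)
```

Case 2 has a single row and therefore no previous event, so it does not appear in `wide`; reindexing on the full list of cases would be one way to bring it back with zeros.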


answered Oct 15 '22 15:10

Andy Hayden
