I'm trying to create a new column that, for each row, holds the value of column b from the previous row that has the same value in column a.
Here is my current code:
import pandas as pd
d = {'a':[1,2,3,1,2,3,2,1], 'b':[10,20,30,40,50,60,70,80]}
df = pd.DataFrame(d)
df['c'] = df['b'][df['a'] == df['a'].prev()]
And my desired output:
   a   b    c
0  1  10  NaN
1  2  20  NaN
2  3  30  NaN
3  1  40   10
4  2  50   20
5  3  60   30
6  2  70   50
7  1  80   40
...which I'm not getting because .prev() is not a real thing. Any thoughts?
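We can group by column a and then "attach" the shifted b column. Grouping sorts the group keys by default, but transform() returns a result aligned with the original index, so the row order is unchanged: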
In [110]: df['c'] = df.groupby('a')['b'].transform(lambda x: x.shift())
In [111]: df
Out[111]:
   a   b     c
0  1  10   NaN
1  2  20   NaN
2  3  30   NaN
3  1  40  10.0
4  2  50  20.0
5  3  60  30.0
6  2  70  50.0
7  1  80  40.0
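To see why this works, it can help to look at what shift() does inside each group. The following is only an illustrative sketch that reuses the question's data; the names pieces and c_check are made up for the example, and the explicit loop is not something you would need in practice:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 1, 2, 3, 2, 1],
                   'b': [10, 20, 30, 40, 50, 60, 70, 80]})

# Shift b by one position inside each group of a. Every shifted piece
# keeps the original row labels of the rows in its group.
pieces = {key: grp.shift() for key, grp in df.groupby('a')['b']}

for key, shifted in pieces.items():
    print(f"a == {key}")
    print(shifted)

# Putting the pieces back into the original row order reproduces column c.
c_check = pd.concat(list(pieces.values())).sort_index()
print(c_check)
```

The first row of each group has nothing before it, which is where the three NaN values come from.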
Or, a much better option: use GroupBy.shift() directly (thank you @Mitch):
In [114]: df['c'] = df.groupby('a')['b'].shift()
In [115]: df
Out[115]:
   a   b     c
0  1  10   NaN
1  2  20   NaN
2  3  30   NaN
3  1  40  10.0
4  2  50  20.0
5  3  60  30.0
6  2  70  50.0
7  1  80  40.0
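One small difference from the desired output in the question: the results above are floats (10.0 rather than 10), because NaN forces the column to a float dtype. If integers matter, a possible workaround is sketched below. It assumes a pandas version that supports the nullable 'Int64' dtype; the cast is an optional extra and the name prev_b is only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 1, 2, 3, 2, 1],
                   'b': [10, 20, 30, 40, 50, 60, 70, 80]})

# Previous b within each group of a. The first row of each group has no
# predecessor, so it comes back as a missing value.
prev_b = df.groupby('a')['b'].shift()

# Optional: cast to the nullable integer dtype so the values stay
# integers and the gaps are stored as <NA> instead of NaN.
df['c'] = prev_b.astype('Int64')

print(df)
```

Note that the missing entries then print as <NA> rather than NaN.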