I'm trying to create a column that, for each row, holds the value of b from the previous row that has the same value of a.
Here is my current code:
import pandas as pd
d = {'a':[1,2,3,1,2,3,2,1], 'b':[10,20,30,40,50,60,70,80]}
df = pd.DataFrame(d)
df['c'] = df['b'][df['a'] == df['a'].prev()]
And my desired output:
   a   b    c
0  1  10  NaN
1  2  20  NaN
2  3  30  NaN
3  1  40   10
4  2  50   20
5  3  60   30
6  2  70   50
7  1  80   40
...which I'm not getting because .prev() is not a real thing. Any thoughts?
Select and print the last row of a dataframe using tail(): df.tail(1) returns the last row as a DataFrame object, which can then be printed.
df.iloc[-2] returns the penultimate row with all of its columns. Equivalently, df.shape[0] gives the row count, and subtracting 2 from it gives the positional index of the penultimate row.
You can use the DataFrame.diff() function to find the difference between rows in a pandas DataFrame. Its periods parameter sets how many rows back to look when calculating the difference.
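The three row-access calls mentioned above can be sketched on the question's data as follows. This is only an illustration: the variable names (last_row, penultimate, row_diff) are made up for the example, and diff() is applied to column b alone to keep the result small.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 1, 2, 3, 2, 1],
                   'b': [10, 20, 30, 40, 50, 60, 70, 80]})

# tail(1) keeps the result a DataFrame with a single row
last_row = df.tail(1)

# iloc[-2] returns the penultimate row as a Series
penultimate = df.iloc[-2]

# Same row, addressed through the row count instead of a negative index
same_penultimate = df.iloc[df.shape[0] - 2]

# Difference between each value of b and the value one row earlier
row_diff = df['b'].diff(periods=1)

print(last_row)
print(penultimate)
print(row_diff)
```

Note that none of these look at column a, so they compare against the previous row overall rather than the previous row with a matching a.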
We can group by the a column (groupby sorts the group keys by default) and then "attach" the shifted b column:
In [110]: df['c'] = df.groupby('a')['b'].transform(lambda x: x.shift())
In [111]: df
Out[111]:
   a   b     c
0  1  10   NaN
1  2  20   NaN
2  3  30   NaN
3  1  40  10.0
4  2  50  20.0
5  3  60  30.0
6  2  70  50.0
7  1  80  40.0
Or, a much better option: use GroupBy.shift() (thank you @Mitch):
In [114]: df['c'] = df.groupby('a')['b'].shift()
In [115]: df
Out[115]:
   a   b     c
0  1  10   NaN
1  2  20   NaN
2  3  30   NaN
3  1  40  10.0
4  2  50  20.0
5  3  60  30.0
6  2  70  50.0
7  1  80  40.0
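For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch that builds the frame from the question and computes the column both ways. The names via_transform, via_shift and c_int are just illustrative. The last step, converting to the nullable Int64 dtype, is an optional extra that is not part of the answer above; it assumes you want integers with missing values, as in the question's desired output, rather than floats.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 1, 2, 3, 2, 1],
                   'b': [10, 20, 30, 40, 50, 60, 70, 80]})

# Option 1: shift b inside each group of a via transform
via_transform = df.groupby('a')['b'].transform(lambda x: x.shift())

# Option 2: GroupBy.shift() does the same without a Python-level lambda
via_shift = df.groupby('a')['b'].shift()

df['c'] = via_shift

# Optional: the NaNs force a float column; the nullable Int64 dtype
# keeps whole numbers and shows the gaps as <NA>
c_int = df['c'].astype('Int64')

print(df)
print(c_int)
```

Both options keep the original row order, so the result can be assigned straight back to the frame.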