I have a dataframe with two columns as below:
Var1 Var2
a 28
b 28
d 28
f 29
f 29
e 30
b 30
m 30
l 30
u 31
t 31
t 31
I'd like to create a third column whose value increases by one each time the value in another column (Var2) changes:
Var1 Var2 Var3
a 28 1
b 28 1
d 28 1
f 29 2
f 29 2
e 30 3
b 30 3
m 30 3
l 30 3
u 31 4
t 31 4
t 31 4
How would I go about doing this?
You can compare Var2 with its shifted-by-1 version and take the cumulative sum of the mismatches:
v
   Var1  Var2
a     0    28
b     1    28
d     2    28
f     3    30
f     4    30
e     5     2
b     6     2
m     7     2
l     8     2
u     9     5
t    10     5
t    11     5
i = v.Var2
v['Var3'] = i.ne(i.shift()).cumsum()
v
   Var1  Var2  Var3
a     0    28     1
b     1    28     1
d     2    28     1
f     3    30     2
f     4    30     2
e     5     2     3
b     6     2     3
m     7     2     3
l     8     2     3
u     9     5     4
t    10     5     4
t    11     5     4
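For anyone who wants to paste and run this, here is a minimal self-contained sketch of the same shift-and-cumsum idea on the data from the question. The frame name `df` and the intermediate name `changed` are only illustrative choices, not part of the answer above.

```python
import pandas as pd

# Data copied from the question; `df` is just an illustrative name
df = pd.DataFrame({
    "Var1": list("abdffebmlutt"),
    "Var2": [28, 28, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31],
})

# True wherever Var2 differs from the previous row.
# The first row is compared against NaN, so it is always True.
changed = df["Var2"].ne(df["Var2"].shift())

# A running total of the True flags numbers each run of equal values
df["Var3"] = changed.cumsum()
print(df)
```

Because the first row always counts as a change, the numbering starts at 1 without any extra offset.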
Using category (this relies on Var2 being sorted in ascending order):
df.Var2.astype('category').cat.codes.add(1)
Out[525]:
0 1
1 1
2 1
3 2
4 2
5 3
6 3
7 3
8 3
9 4
10 4
11 4
dtype: int8
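One thing to keep in mind: category codes are assigned from the sorted unique values, not from the positions where the value changes. The two approaches agree on the question's data but can differ elsewhere. A small sketch with made-up data (the series `s` below is hypothetical, chosen so that 28 reappears after 30):

```python
import pandas as pd

s = pd.Series([28, 28, 30, 30, 28])  # hypothetical data: 28 comes back after 30

# Codes depend only on the value, so both runs of 28 get the same label
by_category = s.astype("category").cat.codes.add(1)

# Counting changes gives the second run of 28 a new label
by_change = s.ne(s.shift()).cumsum()

print(by_category.tolist())  # [1, 1, 2, 2, 1]
print(by_change.tolist())    # [1, 1, 2, 2, 3]
```

So the category approach is probably best kept for data where each value appears in a single ascending block.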
Updated
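from itertools import groupby
import numpy as np
# Split Var2 into runs of consecutive equal values
grouped = [list(g) for k, g in groupby(df.Var2.tolist())]
# Repeat each run's 0-based position once per row in the run, then shift to start at 1
np.repeat(range(len(grouped)), [len(x) for x in grouped]) + 1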
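As a runnable sketch of the itertools variant, assuming the question's Var2 values and an illustrative helper name `run_lengths` (not from the answer above), the result can be assigned straight back to the frame:

```python
from itertools import groupby

import numpy as np
import pandas as pd

# Var2 values copied from the question
df = pd.DataFrame({"Var2": [28, 28, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31]})

# Length of each run of consecutive equal values
run_lengths = [len(list(g)) for _, g in groupby(df["Var2"].tolist())]

# Label run i with i + 1 and repeat that label once per row in the run
df["Var3"] = np.repeat(np.arange(1, len(run_lengths) + 1), run_lengths)
print(df)
```

Like the shift-based version, this numbers runs by position, so a value that reappears later starts a new group.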