Let's suppose that I have the following in python and pandas:
Names Values
0 A 6
1 B 8
2 C 3
3 D 5
4 E 1
5 F 3
6 D 9
7 E 6
8 F 4
9 G 3
10 D 1
11 E 5
12 F 6
and I want to transform this to the following:
Name_1 Values_1 Name_2 Values_2 Name_3 Values_3
0 D 5 D 9 D 1
1 E 1 E 6 E 5
2 F 3 F 4 F 6
3 NA NA G 3 NA NA
Basically what I want to do is to split the original pairs of columns in chunks of data which start with the row where Names is D and ends with the row exactly before the next Names is D starts.
What is the most efficient way to do this?
Idea is create MultiIndex by Series.cumsum with compared Series with GroupBy.cumcount, then reshape by DataFrame.unstack:
b = df['Names'].eq('D').cumsum()
a = df.groupby(b).cumcount()
df1 = df.set_index([a, b]).unstack().drop(0, axis=1, level=1).sort_index(axis=1, level=1)
df1.columns = [f'{x}_{y}' for x, y in df1.columns]
print (df1)
Names_1 Values_1 Names_2 Values_2 Names_3 Values_3
0 D 5.0 D 9.0 D 1.0
1 E 1.0 E 6.0 E 5.0
2 F 3.0 F 4.0 F 6.0
3 NaN NaN G 3.0 NaN NaN
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With