I have the following dataframe:
pp b pp b
5 0.001464 6 0.001853
5 0.001459 6 0.001843
Is there a way to unpivot columns with the same name into multiple rows?
This is the required output:
pp b
5 0.001464
5 0.001459
6 0.001853
6 0.001843
Try groupby with axis=1 (note that axis=1 in groupby is deprecated since pandas 2.1):
df.groupby(df.columns.values, axis=1).agg(lambda x: x.values.tolist()).sum().apply(pd.Series).T.sort_values('pp')
Out[320]:
b pp
0 0.001464 5.0
2 0.001459 5.0
1 0.001853 6.0
3 0.001843 6.0
A fun way with wide_to_long
s=pd.Series(df.columns)
df.columns=df.columns+s.groupby(s).cumcount().astype(str)
pd.wide_to_long(df.reset_index(), stubnames=['pp', 'b'], i='index', j='drop', suffix=r'\d+')
Out[342]:
            pp         b
index drop
0     0      5  0.001464
1     0      5  0.001459
0     1      6  0.001853
1     1      6  0.001843
This is possible using numpy:
res = pd.DataFrame({'pp': df['pp'].values.T.ravel(),
'b': df['b'].values.T.ravel()})
print(res)
b pp
0 0.001464 5
1 0.001459 5
2 0.001853 6
3 0.001843 6
Or without referencing specific columns explicitly:
res = pd.DataFrame({i: df[i].values.T.ravel() for i in set(df.columns)})
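For anyone who wants to run the numpy approach end to end, here is a minimal self-contained sketch. It assumes the frame from the question can be rebuilt with duplicated column labels as shown, and it swaps set(df.columns) for df.columns.unique() so that the column order of the result follows the original frame. That swap is my own choice, not part of the answer above.

```python
import pandas as pd

# Rebuild the question's frame; pandas allows duplicate column labels.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

# With a duplicated label, df[name] is a 2-D frame holding every column
# of that name. Transposing and flattening reads it column by column,
# so all values of the first "pp" come before those of the second one.
res = pd.DataFrame(
    {name: df[name].to_numpy().T.ravel() for name in df.columns.unique()}
)

print(res)
```

This only works when every label is repeated the same number of times. Otherwise the flattened arrays have different lengths and the DataFrame constructor raises an error.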
Let's use melt, cumcount and unstack:
dm = df.melt()
dm.set_index(['variable',dm.groupby('variable').cumcount()])\
.sort_index()['value'].unstack(0)
Output:
variable b pp
0 0.001464 5.0
1 0.001459 5.0
2 0.001853 6.0
3 0.001843 6.0
I'm a little surprised that nobody has mentioned pd.concat so far. Take a look below:
df1 = pd.DataFrame({'Col1':[1,2,3,4], 'Col2':[5,6,7,8]})
df1
Col1 Col2
0 1 5
1 2 6
2 3 7
3 4 8
Now if you run:
df2 = pd.concat([df1,df1])
you get:
Col1 Col2
0 1 5
1 2 6
2 3 7
3 4 8
0 1 5
1 2 6
2 3 7
3 4 8
This is what you wanted, isn't it?
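The example above stacks a frame onto itself, so here is a rough sketch of how the same pd.concat idea might be applied to the frame from the question. It assumes the duplicated columns come in repeating groups in the same order (pp, b, pp, b, ...), and the names n_unique and blocks are only illustrative.

```python
import pandas as pd

# Rebuild the question's frame with duplicated column labels.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=["pp", "b", "pp", "b"],
)

# Number of distinct labels, i.e. the width of one repeating group.
n_unique = df.columns.nunique()

# Slice the frame by position into consecutive groups of columns;
# each slice has unique labels ("pp", "b"), so concat can align them.
blocks = [
    df.iloc[:, start:start + n_unique]
    for start in range(0, df.shape[1], n_unique)
]

# Stack the groups on top of each other and renumber the rows.
res = pd.concat(blocks, ignore_index=True)

print(res)
```

Slicing by position with iloc avoids selecting by label, which would return all columns sharing that name at once.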