I have a pandas dataframe with multiple columns and I want to "flatten" it to just two columns - one with column name and the other with values. E.g.
df1 = pd.DataFrame({'A':[1,2],'B':[2,3], 'C':[3,4]})
How can I convert it to look like:
df2 = pd.DataFrame({'column name': ['A','A','B','B','C','C'], 'value': [1,2,2,3,3,4]})
You can stack
to stack all column values into a single, column, then drop the first level index calling reset_index
, overwrite the column names with the ones you desire and then finally sort using sort_values
:
In [37]:
df2 = df1.stack().reset_index(level=0, drop=True).reset_index()
df2.columns = ['column name', 'value']
df2.sort_values(['column name', 'value'], inplace=True)
df2
Out[37]:
column name value
0 A 1
3 A 2
1 B 2
4 B 3
2 C 3
5 C 4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With