Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

pandas - multiple columns to "column name - value" columns

I have a pandas dataframe with multiple columns and I want to "flatten" it to just two columns - one with column name and the other with values. E.g.

df1 = pd.DataFrame({'A':[1,2],'B':[2,3], 'C':[3,4]})

How can I convert it to look like:

df2 = pd.DataFrame({'column name': ['A','A','B','B','C','C'], 'value': [1,2,2,3,3,4]})
like image 804
MJB Avatar asked Sep 11 '25 05:09

MJB


1 Answers

You can stack to stack all column values into a single, column, then drop the first level index calling reset_index, overwrite the column names with the ones you desire and then finally sort using sort_values:

In [37]:
df2 = df1.stack().reset_index(level=0, drop=True).reset_index()
df2.columns = ['column name', 'value']
df2.sort_values(['column name', 'value'], inplace=True)
df2

Out[37]:
  column name  value
0           A      1
3           A      2
1           B      2
4           B      3
2           C      3
5           C      4
like image 194
EdChum Avatar answered Sep 13 '25 18:09

EdChum