
Pandas DataFrame stack multiple column values into single column

Assuming the following DataFrame:

  key.0 key.1 key.2  topic
1   abc   def   ghi      8
2   xab   xcd   xef      9

How can I combine the values of all the key.* columns into a single column 'key', with each value paired with the topic of the row it came from? This is the result I want:

   topic  key
1      8  abc
2      8  def
3      8  ghi
4      9  xab
5      9  xcd
6      9  xef

Note that the number of key.N columns is not fixed; it depends on some external N.

borice asked Dec 19 '15 22:12





1 Answer

You can melt your dataframe:

>>> keys = [c for c in df if c.startswith('key.')]
>>> pd.melt(df, id_vars='topic', value_vars=keys, value_name='key')
   topic variable  key
0      8    key.0  abc
1      9    key.0  xab
2      8    key.1  def
3      9    key.1  xcd
4      8    key.2  ghi
5      9    key.2  xef

The variable column also tells you which key.* column each value came from.
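Note that melt returns the rows grouped by source column, not by topic as in the question. A minimal, self-contained sketch of one way to get the requested layout follows; the sample frame is rebuilt from the question, and the names `long` and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (its index starts at 1).
df = pd.DataFrame(
    {
        "key.0": ["abc", "xab"],
        "key.1": ["def", "xcd"],
        "key.2": ["ghi", "xef"],
        "topic": [8, 9],
    },
    index=[1, 2],
)

# Pick up however many key.N columns happen to exist.
keys = [c for c in df.columns if c.startswith("key.")]

long = pd.melt(df, id_vars="topic", value_vars=keys, value_name="key")

# melt emits rows column by column; a stable sort on topic regroups them
# while keeping the key.0, key.1, ... order within each topic.
result = (
    long.sort_values("topic", kind="mergesort")
    .drop(columns="variable")
    .reset_index(drop=True)
)
result.index += 1  # optional: 1-based index, as shown in the question

print(result)
```

The stable sort matters here: a non-stable sort could shuffle the keys within a topic.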


Since pandas v0.20, melt is also available as a method on pd.DataFrame:

>>> df.melt('topic', value_name='key').drop('variable', axis=1)
   topic  key
0      8  abc
1      9  xab
2      8  def
3      9  xcd
4      8  ghi
5      9  xef
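As an alternative that is not part of the original answer, the same reshaping can probably be done with stack, which walks the frame row by row and so should keep each topic's keys together without a sort. This is a sketch under the assumption that every column other than topic whose name contains 'key.' should be stacked; the name `stacked` is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "key.0": ["abc", "xab"],
        "key.1": ["def", "xcd"],
        "key.2": ["ghi", "xef"],
        "topic": [8, 9],
    },
    index=[1, 2],
)

stacked = (
    df.set_index("topic")                 # topic becomes the row label
    .filter(like="key.")                  # keep only the key.N columns
    .stack()                              # one row per (topic, key.N) pair
    .reset_index(level=1, drop=True)      # drop the key.N label level
    .rename("key")                        # name the resulting Series
    .reset_index()                        # turn topic back into a column
)

print(stacked)
```

Unlike melt, this version discards the source column name unless the second index level is kept instead of dropped.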
Alexander answered Sep 21 '22 19:09
