Python Pandas Unstacking Unique Column Values to Columns Of Their Own

Question

df = pd.DataFrame({'Col1': ['label1', 'label1', 'label2', 'label2',
          'label3', 'label3', 'label4'],
 'Col2': ['a', 'd', 'b', 'e', 'c', 'f', 'q']}, columns=['Col1', 'Col2'])

Looks like this

     Col1 Col2
0  label1    a
1  label1    d
2  label2    b
3  label2    e
4  label3    c
5  label3    f
6  label4    q

For the unique values in Col1, I want to turn a column's unique values into columns. In a sense, I'm trying to "unstack" Col1 values to be column headers and the row values would be the values in Col2. My key main issue is that I'm not computing any numeric data - it's all text - and I'm simply trying to reshape the structure.

Here's the desired result:

  label1 label2 label3 label4
0      a      b      c      q
1      d      e      f    NaN

I've tried: stack, unstack, pd.melt, pivot_table, pivot.

This ALMOST gets me there, but not quite, and doesn't seem very concise:

df.groupby('Col1').apply(lambda x: x['Col2'].values).to_frame().T

Col1  label1  label2  label3 label4
0     [a, d]  [b, e]  [c, f]    [q]

This question shows how to do it with a pivot table.. but the numeric index in my case is not something I care about.

This question shows how to also do it with a pivot table.. using aggfunc first or ' '.join but that returns CSV instead of values on respective rows.

jezrael · Accepted Answer

You can use cumcount for creating column for new index and then pivot_table with aggregating join:

df['g'] = df.groupby('Col1')['Col1'].cumcount()

print (df.pivot_table(index='g', columns='Col1', values='Col2', aggfunc=''.join))
Col1 label1 label2 label3 label4
g                               
0         a      b      c      q
1         d      e      f   None

Thank you for comment Jeff L.:

df['g'] = df.groupby('Col1')['Col1'].cumcount()

print (df.pivot(index='g', columns='Col1', values='Col2'))
Col1 label1 label2 label3 label4
g                               
0         a      b      c      q
1         d      e      f   None

Or:

print (pd.pivot(index=df.groupby('Col1')['Col1'].cumcount(),
                columns=df['Col1'], 
                values=df['Col2']))

Col1 label1 label2 label3 label4
0         a      b      c      q
1         d      e      f   None

Python Pandas Unstacking Unique Column Values to Columns Of Their Own

Tags:

python

text

pandas

grouping

Jarad

1 Answers

jezrael

Recent Activity

Donate For Us

Python Pandas Unstacking Unique Column Values to Columns Of Their Own

Tags:

python

text

pandas

grouping

Jarad

1 Answers

jezrael

Related questions

Recent Activity

Donate For Us