df = pd.DataFrame({'Col1': ['label1', 'label1', 'label2', 'label2',
'label3', 'label3', 'label4'],
'Col2': ['a', 'd', 'b', 'e', 'c', 'f', 'q']}, columns=['Col1', 'Col2'])
Looks like this
Col1 Col2
0 label1 a
1 label1 d
2 label2 b
3 label2 e
4 label3 c
5 label3 f
6 label4 q
For the unique values in Col1
, I want to turn a column's unique values into columns. In a sense, I'm trying to "unstack" Col1
values to be column headers and the row values would be the values in Col2
. My key main issue is that I'm not computing any numeric data - it's all text - and I'm simply trying to reshape the structure.
Here's the desired result:
label1 label2 label3 label4
0 a b c q
1 d e f NaN
I've tried: stack
, unstack
, pd.melt
, pivot_table
, pivot
.
This ALMOST gets me there, but not quite, and doesn't seem very concise:
df.groupby('Col1').apply(lambda x: x['Col2'].values).to_frame().T
Col1 label1 label2 label3 label4
0 [a, d] [b, e] [c, f] [q]
This question shows how to do it with a pivot table.. but the numeric index in my case is not something I care about.
This question shows how to also do it with a pivot table.. using aggfunc first
or ' '.join
but that returns CSV instead of values on respective rows.
You can use cumcount
for creating column for new index
and then pivot_table
with aggregating join
:
df['g'] = df.groupby('Col1')['Col1'].cumcount()
print (df.pivot_table(index='g', columns='Col1', values='Col2', aggfunc=''.join))
Col1 label1 label2 label3 label4
g
0 a b c q
1 d e f None
Thank you for comment Jeff L.
:
df['g'] = df.groupby('Col1')['Col1'].cumcount()
print (df.pivot(index='g', columns='Col1', values='Col2'))
Col1 label1 label2 label3 label4
g
0 a b c q
1 d e f None
Or:
print (pd.pivot(index=df.groupby('Col1')['Col1'].cumcount(),
columns=df['Col1'],
values=df['Col2']))
Col1 label1 label2 label3 label4
0 a b c q
1 d e f None
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With