Opposite of melt in python pandas

Tags: python, pandas, reshape, pivot, melt

I cannot figure out how to do a "reverse melt" using pandas in Python. This is my starting data:

    import pandas as pd
    from io import StringIO  # Python 3 location of StringIO

    # sep=r'\s+' splits the columns on any run of whitespace
    origin = pd.read_table(StringIO('''label    type    value
    x   a   1
    x   b   2
    x   c   3
    y   a   4
    y   b   5
    y   c   6
    z   a   7
    z   b   8
    z   c   9'''), sep=r'\s+')

    origin
    Out[5]:
      label type  value
    0     x    a      1
    1     x    b      2
    2     x    c      3
    3     y    a      4
    4     y    b      5
    5     y    c      6
    6     z    a      7
    7     z    b      8
    8     z    c      9

This is the output I would like to have:

    label   a   b   c
    x       1   2   3
    y       4   5   6
    z       7   8   9

I'm sure there is an easy way to do this, but I don't know how.

995

asked Mar 02 '14 12:03

Boris Gorelik

2 Answers

There are a few ways. Using `.pivot`:

    >>> origin.pivot(index='label', columns='type')['value']
    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

    [3 rows x 3 columns]

Using `pivot_table`:

    >>> origin.pivot_table(values='value', index='label', columns='type')
           value
    type       a  b  c
    label
    x          1  2  3
    y          4  5  6
    z          7  8  9

    [3 rows x 3 columns]

Or `.groupby` followed by `.unstack`:

    >>> origin.groupby(['label', 'type'])['value'].aggregate('mean').unstack()
    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

    [3 rows x 3 columns]
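To check that the three approaches above agree, here is a minimal sketch. It assumes the sample data from the question, built directly as a DataFrame rather than parsed from text, and the variable names `by_pivot`, `by_pivot_table` and `by_groupby` are just illustrative.

```python
import pandas as pd

# Sample data from the question, built directly instead of parsed from text.
origin = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

by_pivot = origin.pivot(index="label", columns="type", values="value")
by_pivot_table = origin.pivot_table(values="value", index="label", columns="type")
by_groupby = origin.groupby(["label", "type"])["value"].mean().unstack()

# pivot keeps the original integers; the mean-based variants may return floats,
# so compare the values without checking the dtype.
pd.testing.assert_frame_equal(by_pivot, by_pivot_table, check_dtype=False)
pd.testing.assert_frame_equal(by_pivot, by_groupby, check_dtype=False)
print(by_pivot)
```

Note that `pivot_table` and the `groupby` variant aggregate (mean by default), while `pivot` only reshapes, so they behave the same only when each `label`/`type` pair occurs once.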

172

answered Sep 21 '22 08:09

behzad.nouri

`DataFrame.set_index` + `DataFrame.unstack`

    df.set_index(['label','type'])['value'].unstack()

    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9
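As a sanity check that this really is the opposite of `melt`, the wide result can be melted back and compared with the starting frame. This is only a sketch on the question's sample data; `wide` and `long_again` are names I made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Long -> wide: one row per label, one column per type.
wide = df.set_index(["label", "type"])["value"].unstack()

# Wide -> long again: melt orders rows by column, so sort to restore the original order.
long_again = (wide.reset_index()
                  .melt(id_vars="label", var_name="type", value_name="value")
                  .sort_values(["label", "type"], ignore_index=True))

pd.testing.assert_frame_equal(long_again, df, check_dtype=False)
print(long_again.head())
```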

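Simplifying the passing of pivot arguments by unpacking the column labels positionally (this only works in pandas versions before 2.0, where `pivot` still accepts positional arguments):

    df.pivot(*df)

    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

    [*df]  # ['label', 'type', 'value']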
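On pandas 2.0 and later, where `pivot` takes keyword-only arguments, a similar shortcut can be sketched by zipping the column labels onto the parameter names. This assumes the frame has exactly three columns, ordered as index, columns, values; `kwargs` and `wide` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Map the three column labels onto pivot's keyword parameters, in order.
# Assumption: the columns are ordered index, columns, values.
kwargs = dict(zip(("index", "columns", "values"), df.columns))
wide = df.pivot(**kwargs)
print(wide)
```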

For the expected output we need `DataFrame.reset_index` and `DataFrame.rename_axis`:

    df.pivot(*df).rename_axis(columns = None).reset_index()

      label  a  b  c
    0     x  1  2  3
    1     y  4  5  6
    2     z  7  8  9
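The same steps can be wrapped in a small function with explicit keyword arguments, which should also run on current pandas. `unmelt` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def unmelt(frame, index, columns, values):
    """Hypothetical helper: pivot a long frame to wide and flatten the result."""
    return (frame.pivot(index=index, columns=columns, values=values)
                 .rename_axis(columns=None)  # drop the 'type' name above the columns
                 .reset_index())             # turn the index back into a regular column

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

flat = unmelt(df, index="label", columns="type", values="value")
print(flat)
```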

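If there are duplicate `label`/`type` combinations (so that a column such as `a` or `b` would receive more than one value per label), we could lose information, so we need `GroupBy.cumcount` to tell the duplicates apart: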

    print(df)

      label type  value
    0     x    a      1
    1     x    b      2
    2     x    c      3
    3     y    a      4
    4     y    b      5
    5     y    c      6
    6     z    a      7
    7     z    b      8
    8     z    c      9
    0     x    a      1
    1     x    b      2
    2     x    c      3
    3     y    a      4
    4     y    b      5
    5     y    c      6
    6     z    a      7
    7     z    b      8
    8     z    c      9

    df.pivot_table(index = ['label',
                            df.groupby(['label','type']).cumcount()],
                   columns = 'type',
                   values = 'value')

    type     a  b  c
    label
    x     0  1  2  3
          1  1  2  3
    y     0  4  5  6
          1  4  5  6
    z     0  7  8  9
          1  7  8  9

Or:

    (df.assign(type_2 = df.groupby(['label','type']).cumcount())
       .set_index(['label','type','type_2'])['value']
       .unstack('type'))
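Here is a self-contained sketch of the duplicate case. It assumes the question's data stacked twice with a fresh index (`ignore_index=True`, unlike the printout above), and the names `occurrence` and `pivot_failed` are mine. It shows that a plain `pivot` refuses duplicated pairs, while numbering the repeats with `cumcount` keeps every row.

```python
import pandas as pd

base = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})
# Every (label, type) pair now appears twice.
df = pd.concat([base, base], ignore_index=True)

# A plain pivot cannot place two values in the same cell and raises ValueError.
try:
    df.pivot(index="label", columns="type", values="value")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# Number each repeat within its (label, type) group: 0 for the first, 1 for the second.
occurrence = df.groupby(["label", "type"]).cumcount()

wide = (df.assign(occurrence=occurrence)
          .set_index(["label", "occurrence", "type"])["value"]
          .unstack("type"))
print(wide)
```

The extra `occurrence` index level makes each row key unique, which is what lets `unstack` reshape without dropping or averaging anything.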

answered Sep 20 '22 08:09

ansev
