
Opposite of melt in python pandas

I cannot figure out how to do a "reverse melt" using pandas in Python. This is my starting data:

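    import pandas as pd
    from io import StringIO  # Python 3; the original import was the Python 2 StringIO module

    # sep=r'\s+' splits on any run of whitespace (the pasted data uses spaces, not tabs)
    origin = pd.read_table(StringIO('''label    type    value
    x   a   1
    x   b   2
    x   c   3
    y   a   4
    y   b   5
    y   c   6
    z   a   7
    z   b   8
    z   c   9'''), sep=r'\s+')

    origin
    Out[5]:
      label type  value
    0     x    a      1
    1     x    b      2
    2     x    c      3
    3     y    a      4
    4     y    b      5
    5     y    c      6
    6     z    a      7
    7     z    b      8
    8     z    c      9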

This is the output I would like to have:

    label   a   b   c
        x   1   2   3
        y   4   5   6
        z   7   8   9

I'm sure there is an easy way to do this, but I don't know how.

Boris Gorelik asked Mar 02 '14 12:03




2 Answers

There are a few ways.
Using .pivot:

    >>> origin.pivot(index='label', columns='type')['value']
    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

    [3 rows x 3 columns]

Using pivot_table:

    >>> origin.pivot_table(values='value', index='label', columns='type')
           value
    type       a  b  c
    label
    x          1  2  3
    y          4  5  6
    z          7  8  9

    [3 rows x 3 columns]

Or .groupby followed by .unstack:

    >>> origin.groupby(['label', 'type'])['value'].aggregate('mean').unstack()
    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

    [3 rows x 3 columns]
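For readers on a current pandas release, the three approaches above can be checked against each other in one self-contained script. This is only a minimal sketch: the frame is rebuilt by hand instead of being parsed from text, and the names `via_pivot`, `via_pivot_table`, and `via_groupby` are illustrative.

```python
import pandas as pd

# Rebuild the question's long-format frame by hand.
origin = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# 1. pivot: a pure reshape; it raises if a (label, type) pair repeats.
via_pivot = origin.pivot(index="label", columns="type", values="value")

# 2. pivot_table: aggregates each cell (mean by default), so the result is float.
via_pivot_table = origin.pivot_table(values="value", index="label", columns="type")

# 3. groupby + unstack: aggregate explicitly, then move 'type' into the columns.
via_groupby = origin.groupby(["label", "type"])["value"].mean().unstack()

print(via_pivot)
```

The practical difference is that `pivot` only reshapes, while the other two aggregate first, which matters as soon as a (label, type) pair appears more than once.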
behzad.nouri answered Sep 21 '22 08:09



DataFrame.set_index + DataFrame.unstack (df here is the question's origin frame)

    df.set_index(['label','type'])['value'].unstack()

    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

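Simplifying the passing of pivot arguments: unpacking the frame yields its column names in order, which here match pivot's index, columns, values. Note that this relies on positional arguments, which pivot no longer accepts in pandas 2.0 and later.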

    df.pivot(*df)

    type   a  b  c
    label
    x      1  2  3
    y      4  5  6
    z      7  8  9

[*df] #['label', 'type', 'value'] 
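On pandas 2.0 and later, `DataFrame.pivot` takes keyword-only arguments, so the positional shortcut raises a `TypeError`. One possible equivalent, assuming the columns happen to be ordered as index, columns, values (as they are in this frame), pairs the column names with the parameter names. The `kwargs` and `wide` names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Iterating a DataFrame yields its column names: 'label', 'type', 'value'.
# Pair them with pivot's parameter names and pass them as keywords.
kwargs = dict(zip(["index", "columns", "values"], df))
wide = df.pivot(**kwargs)
print(wide)
```

Spelling out `index='label', columns='type', values='value'` is more readable; the zip form only pays off when the column names are not known in advance.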

To get exactly the expected output, with label as an ordinary column and no "type" name on the columns axis, we also need DataFrame.reset_index and DataFrame.rename_axis

    df.pivot(*df).rename_axis(columns = None).reset_index()

      label  a  b  c
    0     x  1  2  3
    1     y  4  5  6
    2     z  7  8  9
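Since the question asks for the opposite of melt, a round-trip check may be a useful sanity test: melting the wide result should give back the original long frame. A small sketch, with the frame rebuilt by hand, keyword arguments used for pivot, and `wide` and `long_again` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})

# Long -> wide, with 'label' as a regular column and no name on the columns axis.
wide = (df.pivot(index="label", columns="type", values="value")
          .rename_axis(columns=None)
          .reset_index())

# Wide -> long again; melt stacks column by column, so re-sort to the original order.
long_again = (wide.melt(id_vars="label", var_name="type", value_name="value")
                  .sort_values(["label", "type"])
                  .reset_index(drop=True))

print(long_again)
```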

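If some (label, type) pairs appear more than once, pivot cannot reshape them and an aggregating pivot_table would collapse them into one cell, so we could lose information. GroupBy.cumcount numbers the repeats so that each occurrence keeps its own row.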

    print(df)

      label type  value
    0     x    a      1
    1     x    b      2
    2     x    c      3
    3     y    a      4
    4     y    b      5
    5     y    c      6
    6     z    a      7
    7     z    b      8
    8     z    c      9
    0     x    a      1
    1     x    b      2
    2     x    c      3
    3     y    a      4
    4     y    b      5
    5     y    c      6
    6     z    a      7
    7     z    b      8
    8     z    c      9

    df.pivot_table(index = ['label',
                            df.groupby(['label','type']).cumcount()],
                   columns = 'type',
                   values = 'value')

    type     a  b  c
    label
    x     0  1  2  3
          1  1  2  3
    y     0  4  5  6
          1  4  5  6
    z     0  7  8  9
          1  7  8  9

Or:

    (df.assign(type_2 = df.groupby(['label','type']).cumcount())
       .set_index(['label','type','type_2'])['value']
       .unstack('type'))
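The duplicate case can be reproduced end to end with a sketch like the following. It makes a few assumptions that are not in the answer: the duplicated frame is built with `pd.concat` and `ignore_index=True` (so the row index is 0 to 17 instead of 0 to 8 twice), and the helper column is called `occurrence` instead of `type_2`.

```python
import pandas as pd

base = pd.DataFrame({
    "label": list("xxxyyyzzz"),
    "type": list("abcabcabc"),
    "value": range(1, 10),
})
# Every (label, type) pair now appears twice.
df = pd.concat([base, base], ignore_index=True)

# A plain pivot cannot place two values in one cell and raises ValueError.
try:
    df.pivot(index="label", columns="type", values="value")
except ValueError as err:
    print("pivot failed:", err)

# cumcount numbers the repeats within each pair: 0 for the first, 1 for the second.
occurrence = df.groupby(["label", "type"]).cumcount()

# With the occurrence number in the index, every cell is unique again.
wide = (df.assign(occurrence=occurrence)
          .set_index(["label", "occurrence", "type"])["value"]
          .unstack("type"))
print(wide)
```

The result has one row per (label, occurrence) pair, so no value is averaged away or dropped.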
ansev answered Sep 20 '22 08:09
