Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

smartest way to create pivoted multi-index dataframe out of a stacked dataframe

Tags:

python

pandas

So I have a very typical set of data that look like this:

data = {'date': {0: '10/02/2017',
  1: '10/02/2017',
  2: '10/02/2017',
  3: '10/02/2017',
  4: '10/02/2017'},
 'field': {0: 'field1', 1: 'field2', 2: 'field1', 3: 'field2', 4: 'field3'},
 'type': {0: 'type1', 1: 'type1', 2: 'type2', 3: 'type2', 4: 'type2'},
 'value': {0: 1.79067,
  1: 1.7987200000000001,
  2: 1.7978900000000002,
  3: 1.8001099999999999,
  4: 1.8045599999999999}}

df = pd.DataFrame(data)
df.date = pd.to_datetime(df.date)

this should look like this (real data set has many different dates):

        date   field   type    value
0 2017-10-02  field1  type1  1.79067
1 2017-10-02  field2  type1  1.79872
2 2017-10-02  field1  type2  1.79789
3 2017-10-02  field2  type2  1.80011
4 2017-10-02  field3  type2  1.80456

I want to create a pivoted multi-index dataframe such has:

              type1             type2                  
field        field1   field2   field1   field2   field3
date                                                   
2017-10-02  1.79067  1.79872  1.79789  1.80011  1.80456

the smartest way I found so far is this:

grouped = df.groupby('type')
res = {}
for name, df in grouped:
    res[name] = df.pivot(index='date', columns='field', values='value')
df = pd.concat(res, axis=1)

is there any other efficient way to achieve this?

thanks

like image 670
Steven G Avatar asked Dec 18 '22 03:12

Steven G


2 Answers

Option 1

By using unstack

In [36]: df.set_index(['date','field','type'])['value'].unstack([-1,-2])
Out[36]:
type          type1             type2
field        field1   field2   field1   field2   field3
date
2017-10-02  1.79067  1.79872  1.79789  1.80011  1.80456

Option2

pivot_table

pd.pivot_table(df,values='value',index='date',columns=['type','field'])
Out[464]: 
type          type1             type2                  
field        field1   field2   field1   field2   field3
date                                                   
2017-10-02  1.79067  1.79872  1.79789  1.80011  1.80456
like image 157
BENY Avatar answered Dec 20 '22 20:12

BENY


Option 1
Using defaultdict

from collections import defaultdict

d = defaultdict(lambda: defaultdict(dict))

for t in df.itertuples():
    d[('value', t.field, t.type)][t.date] = t.value

pd.DataFrame(d)

              value                                    
             field1            field2            field3
              type1    type2    type1    type2    type2
2017-10-02  1.79067  1.79789  1.79872  1.80011  1.80456

Option 2
Use groupby
This looks and feels a lot like @Wen's answer

df.groupby(['date', 'field', 'type']).first().unstack([-2, -1])

              value                                    
field        field1            field2            field3
type          type1    type2    type1    type2    type2
date                                                   
2017-10-02  1.79067  1.79789  1.79872  1.80011  1.80456
like image 25
piRSquared Avatar answered Dec 20 '22 21:12

piRSquared