Reshaping/Pivoting Data with Date Value

Question

I need to pivot/reshape long-form data in two ways: 1) adding end-of-month date columns and filling them with a numeric value (the total); 2) adding end-of-month date columns and filling them with a date value (the day on which the 'total' value from the previous pivot was reached).

I can do 1 with:

import pandas as pd

data = pd.DataFrame({'date': ['1-12-2016', '1-23-2016', '2-23-2016', '2-1-2016', '3-4-2016'],
        'EOM': ['1-31-2016', '1-31-2016', '2-28-2016', '2-28-2016', '3-31-2016'],
        'country':['uk', 'usa', 'fr','fr','uk'],
        'tr_code': [10, 21, 20, 10,12],
        'TOTAL': [435, 367,891,1234,231]
        })

data['EOM'] = pd.to_datetime(data['EOM'])
data['date'] = pd.to_datetime(data['date'])


data_total = data.pivot_table(values='TOTAL', index=['country','tr_code'], columns='EOM')

Out[73]: 
EOM              2016-01-31  2016-02-28  2016-03-31
country tr_code                                    
fr      10              NaN      1234.0         NaN
        20              NaN       891.0         NaN
uk      10            435.0         NaN         NaN
        12              NaN         NaN       231.0
usa     21            367.0         NaN         NaN

However, changing the values argument to 'date' raises: DataError: No numeric types to aggregate

I basically want two DataFrames: the one I already have, and another in the same format that holds, instead of the 'TOTAL' value, the 'date' on which that total was reached.

Any help is greatly appreciated.

piRSquared · Accepted Answer

`set_index` with `unstack`

This assumes the combinations of ['country', 'tr_code', 'EOM'] are unique and will break if they are not. This is why an aggregation function is important. We need a rule if and when we get multiple observations of a combination.

data.set_index(['country', 'tr_code', 'EOM']).date.unstack()

EOM             2016-01-31 2016-02-28 2016-03-31
country tr_code                                 
fr      10             NaT 2016-02-01        NaT
        20             NaT 2016-02-23        NaT
uk      10      2016-01-12        NaT        NaT
        12             NaT        NaT 2016-03-04
usa     21      2016-01-23        NaT        NaT
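The uniqueness caveat above can be checked directly. The following is a minimal sketch on hypothetical data (the `dup` frame and its values are made up for illustration, not taken from the question), in which one combination of the three keys appears twice. It first tests for duplicates with `duplicated`, then shows that `unstack` refuses to reshape them.

```python
import pandas as pd

# Hypothetical data: the combination ('uk', 10, 2016-01-31) appears twice.
dup = pd.DataFrame({
    'date': pd.to_datetime(['2016-01-12', '2016-01-20', '2016-03-04']),
    'EOM': pd.to_datetime(['2016-01-31', '2016-01-31', '2016-03-31']),
    'country': ['uk', 'uk', 'fr'],
    'tr_code': [10, 10, 10],
})

keys = ['country', 'tr_code', 'EOM']

# True when at least one key combination is repeated.
has_duplicates = bool(dup.duplicated(subset=keys).any())

try:
    dup.set_index(keys)['date'].unstack()
    unstack_failed = False
except ValueError as err:
    # pandas cannot decide which of the two dates belongs in the cell.
    unstack_failed = True
    print(err)
```

If `has_duplicates` is true, one of the aggregating approaches is the safer choice.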

`aggfunc` / `pivot_table`

The default aggregation function is `mean`, which makes no sense for dates. `first` will do. `last` would work as well; ALollz used it in their deleted answer.

data.pivot_table(
    values='date', index=['country', 'tr_code'], columns='EOM', aggfunc='first')

EOM             2016-01-31 2016-02-28 2016-03-31
country tr_code                                 
fr      10             NaT 2016-02-01        NaT
        20             NaT 2016-02-23        NaT
uk      10      2016-01-12        NaT        NaT
        12             NaT        NaT 2016-03-04
usa     21      2016-01-23        NaT        NaT
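The choice between `first` and `last` only matters when a combination occurs more than once. As a rough illustration on hypothetical data (the `dup` frame is invented for this sketch), the two rules pick different dates for the repeated combination:

```python
import pandas as pd

# Hypothetical data: ('uk', 10, 2016-01-31) has two observations.
dup = pd.DataFrame({
    'date': pd.to_datetime(['2016-01-12', '2016-01-20', '2016-03-04']),
    'EOM': pd.to_datetime(['2016-01-31', '2016-01-31', '2016-03-31']),
    'country': ['uk', 'uk', 'fr'],
    'tr_code': [10, 10, 10],
})

# Same call as above, once per aggregation rule.
first = dup.pivot_table(
    values='date', index=['country', 'tr_code'], columns='EOM', aggfunc='first')
last = dup.pivot_table(
    values='date', index=['country', 'tr_code'], columns='EOM', aggfunc='last')

jan = pd.Timestamp('2016-01-31')
print(first.loc[('uk', 10), jan])  # earliest row in the frame for that cell
print(last.loc[('uk', 10), jan])   # latest row in the frame for that cell
```

Note that `first` and `last` follow row order, not chronological order, so sorting by `date` beforehand (or using `min` / `max`) may be preferable if the rows are not already ordered.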

`groupby`

A less glamorous way of doing the same thing as `pivot_table`:

data.groupby(['country', 'tr_code', 'EOM']).date.first().unstack()

EOM             2016-01-31 2016-02-28 2016-03-31
country tr_code                                 
fr      10             NaT 2016-02-01        NaT
        20             NaT 2016-02-23        NaT
uk      10      2016-01-12        NaT        NaT
        12             NaT        NaT 2016-03-04
usa     21      2016-01-23        NaT        NaT

Tags: python-3.x, pandas

HowdyDude
