Pivoting a pandas DataFrame into prefixed columns, not a MultiIndex

Tags:

python

pandas

I have a timeseries dataframe that is similar to:

import pandas as pd

ts = pd.DataFrame([['Jan 2000','WidgetCo',0.5, 2], ['Jan 2000','GadgetCo',0.3, 3], ['Jan 2000','SnazzyCo',0.2, 4],
          ['Feb 2000','WidgetCo',0.4, 2], ['Feb 2000','GadgetCo',0.5, 2.5], ['Feb 2000','SnazzyCo',0.1, 4],
          ], columns=['month','company','share','price'])

Which looks like:

      month   company  share  price
0  Jan 2000  WidgetCo    0.5    2.0
1  Jan 2000  GadgetCo    0.3    3.0
2  Jan 2000  SnazzyCo    0.2    4.0
3  Feb 2000  WidgetCo    0.4    2.0
4  Feb 2000  GadgetCo    0.5    2.5
5  Feb 2000  SnazzyCo    0.1    4.0

I can pivot this table like so:

pd.pivot_table(ts, index='month', columns='company')

Which gets me:

            share                      price                  
company  GadgetCo SnazzyCo WidgetCo GadgetCo SnazzyCo WidgetCo
month                                                         
Feb 2000      0.5      0.1      0.4      2.5        4        2
Jan 2000      0.3      0.2      0.5      3.0        4        2

This is what I want except that I need to collapse the MultiIndex so that the company is used as a prefix for share and price like so:

          WidgetCo_share  WidgetCo_price  GadgetCo_share  GadgetCo_price   ...
month                                                                      
Jan 2000             0.5               2             0.3             3.0   
Feb 2000             0.4               2             0.5             2.5

I came up with this function to do just that but it seems like a poor solution:

def pivot_table_to_flat(df, column, index):
    res = df.set_index(index)
    cols = res.drop(column, axis=1).columns.values
    resulting_cols = []
    for prefix in res[column].unique():
        for col in cols:
            new_col_name = prefix + '_' + col
            res[new_col_name] = res[res[column] == prefix][col]
            resulting_cols.append(new_col_name)

    return res[resulting_cols]

pivot_table_to_flat(ts, index='month', column='company')

What is a better way of accomplishing a pivot that results in columns with prefixes as opposed to a MultiIndex?

asked Nov 21 '14 22:11

Ben Mabey

1 Answer

This seems even simpler:

df.columns = [' '.join(col).strip() for col in df.columns.values]

It takes a df whose columns are a MultiIndex and flattens the column labels in place, joining the levels of each label with a space.

(ref: @andy-haden Python Pandas - How to flatten a hierarchical index in columns )
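Applied to the pivot from the question, the same idea can be adapted to give the `company_value` names that were asked for. This is only a sketch: the name `flat` is introduced here for illustration, and it assumes each column label is a `(value, company)` tuple, which is what `pd.pivot_table` produces for this data. The levels are swapped and joined with an underscore instead of a space.

```python
import pandas as pd

ts = pd.DataFrame(
    [['Jan 2000', 'WidgetCo', 0.5, 2], ['Jan 2000', 'GadgetCo', 0.3, 3],
     ['Jan 2000', 'SnazzyCo', 0.2, 4], ['Feb 2000', 'WidgetCo', 0.4, 2],
     ['Feb 2000', 'GadgetCo', 0.5, 2.5], ['Feb 2000', 'SnazzyCo', 0.1, 4]],
    columns=['month', 'company', 'share', 'price'])

# `flat` is a hypothetical name for the pivoted frame.
flat = pd.pivot_table(ts, index='month', columns='company')

# Each column label is a tuple such as ('share', 'GadgetCo').
# Put the company first and join the two parts with an underscore.
flat.columns = ['_'.join((company, value)) for value, company in flat.columns]

print(flat.columns.tolist())
```

The order of the resulting columns follows whatever order `pivot_table` returned, so select or reorder the columns explicitly if a particular layout matters.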

answered Nov 03 '22 04:11

CPBL
