I have a timeseries dataframe that is similar to:
ts = pd.DataFrame([
    ['Jan 2000', 'WidgetCo', 0.5, 2], ['Jan 2000', 'GadgetCo', 0.3, 3], ['Jan 2000', 'SnazzyCo', 0.2, 4],
    ['Feb 2000', 'WidgetCo', 0.4, 2], ['Feb 2000', 'GadgetCo', 0.5, 2.5], ['Feb 2000', 'SnazzyCo', 0.1, 4],
], columns=['month', 'company', 'share', 'price'])
Which looks like:
      month   company  share  price
0  Jan 2000  WidgetCo    0.5    2.0
1  Jan 2000  GadgetCo    0.3    3.0
2  Jan 2000  SnazzyCo    0.2    4.0
3  Feb 2000  WidgetCo    0.4    2.0
4  Feb 2000  GadgetCo    0.5    2.5
5  Feb 2000  SnazzyCo    0.1    4.0
I can pivot this table like so:
pd.pivot_table(ts,index='month', columns='company')
Which gets me:
            share                      price
company  GadgetCo SnazzyCo WidgetCo GadgetCo SnazzyCo WidgetCo
month
Feb 2000      0.5      0.1      0.4      2.5        4        2
Jan 2000      0.3      0.2      0.5      3.0        4        2
This is what I want, except that I need to collapse the MultiIndex so that the company is used as a prefix for share and price, like so:
          WidgetCo_share  WidgetCo_price  GadgetCo_share  GadgetCo_price  ...
month
Jan 2000             0.5               2             0.3             3.0
Feb 2000             0.4               2             0.5             2.5
I came up with this function to do just that but it seems like a poor solution:
def pivot_table_to_flat(df, column, index):
    res = df.set_index(index)
    cols = res.drop(column, axis=1).columns.values
    resulting_cols = []
    for prefix in res[column].unique():
        for col in cols:
            new_col_name = prefix + '_' + col
            res[new_col_name] = res[res[column] == prefix][col]
            resulting_cols.append(new_col_name)
    return res[resulting_cols]
pivot_table_to_flat(ts, index='month', column='company')
What is a better way to do a pivot that results in columns with prefixes rather than a MultiIndex?
This seems even simpler:
df.columns = [' '.join(col).strip() for col in df.columns.values]
It takes a df with MultiIndex columns and flattens the column labels, modifying df in place rather than returning a copy.
(ref: @andy-haden, Python Pandas - How to flatten a hierarchical index in columns)
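Note that joining the labels as-is gives names like "share GadgetCo", with the value name first and a space as separator. To get the company as a prefix, as the question asks, the two levels can be swapped while building the new names. The following is a minimal sketch, assuming the ts frame from the question and a pivot with exactly two column levels; the name flat is just for illustration, and the column order may differ between pandas versions:

```python
import pandas as pd

# Sample data from the question.
ts = pd.DataFrame([
    ['Jan 2000', 'WidgetCo', 0.5, 2], ['Jan 2000', 'GadgetCo', 0.3, 3], ['Jan 2000', 'SnazzyCo', 0.2, 4],
    ['Feb 2000', 'WidgetCo', 0.4, 2], ['Feb 2000', 'GadgetCo', 0.5, 2.5], ['Feb 2000', 'SnazzyCo', 0.1, 4],
], columns=['month', 'company', 'share', 'price'])

# Pivot as in the question; the columns are a two-level MultiIndex.
flat = pd.pivot_table(ts, index='month', columns='company')

# Each column label is a (value, company) tuple, e.g. ('share', 'WidgetCo').
# Put the company first so that it acts as the prefix.
flat.columns = [f'{company}_{value}' for value, company in flat.columns]

print(flat)
```

If a specific column order is needed (for example, all WidgetCo columns first), the result can be reordered afterwards by selecting the columns in the desired order, e.g. flat[['WidgetCo_share', 'WidgetCo_price']].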