I had following data frame (the real data frame is much more larger than this one ) :
sale_user_id sale_product_id count 1 1 1 1 8 1 1 52 1 1 312 5 1 315 1
Then reshaped it to move the values in sale_product_id as column headers using the following code:
reshaped_df=id_product_count.pivot(index='sale_user_id',columns='sale_product_id',values='count')
and the resulting data frame is:
sale_product_id -1057 1 2 3 4 5 6 8 9 10 ... 98 980 981 982 983 984 985 986 987 99 sale_user_id 1 NaN 1.0 NaN NaN NaN NaN NaN 1.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 3 NaN 1.0 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 4 NaN NaN 1.0 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
as you can see we have a multililevel index , what i need is to have sale_user_is in the first column without multilevel indexing:
i take the following approach :
reshaped_df.reset_index()
the the result would be like this i still have the sale_product_id column , but i do not need it anymore:
sale_product_id sale_user_id -1057 1 2 3 4 5 6 8 9 ... 98 980 981 982 983 984 985 986 987 99 0 1 NaN 1.0 NaN NaN NaN NaN NaN 1.0 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 3 NaN 1.0 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2 4 NaN NaN 1.0 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN
i can subset this data frame to get rid of sale_product_id
but i don't think it would be efficient.I am looking for an efficient way to get rid of multilevel indexing while reshaping the original data frame
To drop multiple levels from a multi-level column index, use the columns. droplevel() repeatedly.
The most straightforward way to drop a Pandas dataframe index is to use the Pandas . reset_index() method. By default, the method will only reset the index, forcing values from 0 - len(df)-1 as the index. The method will also simply insert the dataframe index into a column in the dataframe.
You need remove only index name
, use rename_axis
(new in pandas
0.18.0
):
print (reshaped_df) sale_product_id 1 8 52 312 315 sale_user_id 1 1 1 1 5 1 print (reshaped_df.index.name) sale_user_id print (reshaped_df.rename_axis(None)) sale_product_id 1 8 52 312 315 1 1 1 1 5 1
Another solution working in pandas below 0.18.0
:
reshaped_df.index.name = None print (reshaped_df) sale_product_id 1 8 52 312 315 1 1 1 1 5 1
If need remove columns name
also:
print (reshaped_df.columns.name) sale_product_id print (reshaped_df.rename_axis(None).rename_axis(None, axis=1)) 1 8 52 312 315 1 1 1 1 5 1
Another solution:
reshaped_df.columns.name = None reshaped_df.index.name = None print (reshaped_df) 1 8 52 312 315 1 1 1 1 5 1
EDIT by comment:
You need reset_index
with parameter drop=True
:
reshaped_df = reshaped_df.reset_index(drop=True) print (reshaped_df) sale_product_id 1 8 52 312 315 0 1 1 1 5 1 #if need reset index nad remove column name reshaped_df = reshaped_df.reset_index(drop=True).rename_axis(None, axis=1) print (reshaped_df) 1 8 52 312 315 0 1 1 1 5 1
Of if need remove only column name:
reshaped_df = reshaped_df.rename_axis(None, axis=1) print (reshaped_df) 1 8 52 312 315 sale_user_id 1 1 1 1 5 1
Edit1:
So if need create new column from index
and remove columns names
:
reshaped_df = reshaped_df.rename_axis(None, axis=1).reset_index() print (reshaped_df) sale_user_id 1 8 52 312 315 0 1 1 1 1 5 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With