I have a dataframe:
    pe_odds[ [ 'EVENT_ID', 'SELECTION_ID', 'ODDS' ] ]
    Out[67]:
        EVENT_ID  SELECTION_ID   ODDS
    0  100429300       5297529  18.00
    1  100429300       5297529  20.00
    2  100429300       5297529  21.00
    3  100429300       5297529  22.00
    4  100429300       5297529  23.00
    5  100429300       5297529  24.00
    6  100429300       5297529  25.00
When I use groupby and agg, I get results with a multi-index:
    pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] )
    Out[68]:
                             amin   amax
    EVENT_ID  SELECTION_ID
    100428417 5490293        1.71   1.71
              5881623        1.14   1.35
              5922296        2.00   2.00
              5956692        2.00   2.02
    100428419 603721         2.44   2.90
              4387436        4.30   6.20
              4398859        1.23   1.35
              4574687        1.35   1.46
              4881396       14.50  19.00
              6032606        2.94   4.20
              6065580        2.70   5.80
              6065582        2.42   3.65
    100428421 5911426        2.22   2.52
I have tried passing as_index=False to return the results without the multi-index:
pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ], as_index=False )[ 'ODDS' ].agg( [ np.min, np.max ], as_index=False )
But it still gives me a multi-index.
I can use .reset_index(), but it is very slow:
    pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] ).reset_index()
    Out[69]:
        EVENT_ID  SELECTION_ID  amin  amax
    0  100428417       5490293  1.71  1.71
    1  100428417       5881623  1.14  1.35
    2  100428417       5922296  2.00  2.00
    3  100428417       5956692  2.00  2.02
    4  100428419        603721  2.44  2.90
    5  100428419       4387436  4.30  6.20
How can I return the results without the multi-index, using only parameters of the groupby and/or agg functions, and without having to resort to reset_index()?
The call below:
    >>> gr = df.groupby(['EVENT_ID', 'SELECTION_ID'], as_index=False)
    >>> res = gr.agg({'ODDS':[np.min, np.max]})
    >>> res
        EVENT_ID SELECTION_ID ODDS
                              amin amax
    0  100429300      5297529   18   25
    1  100429300      5297559   30   38
returns a frame with multi-index columns. If you do not want the columns to be a multi-index either, you can flatten them by joining the levels of each column name:
    >>> res.columns = list(map(''.join, res.columns.values))
    >>> res
        EVENT_ID  SELECTION_ID  ODDSamin  ODDSamax
    0  100429300       5297529        18        25
    1  100429300       5297559        30        38
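As a possible alternative, not part of the answer above, named aggregation may avoid the multi-index on both rows and columns in one step. This is a minimal sketch that assumes pandas 0.25 or later; the sample values and the output names odds_min and odds_max are illustrative choices, not taken from the original data.

```python
import pandas as pd

# Small sample frame modelled on the question's data (values are illustrative).
df = pd.DataFrame({
    "EVENT_ID": [100429300] * 5,
    "SELECTION_ID": [5297529, 5297529, 5297529, 5297559, 5297559],
    "ODDS": [18.0, 20.0, 25.0, 30.0, 38.0],
})

# Named aggregation: each keyword becomes one flat output column,
# given as a (source column, aggregation) pair.
# as_index=False keeps the group keys as ordinary columns.
res = df.groupby(["EVENT_ID", "SELECTION_ID"], as_index=False).agg(
    odds_min=("ODDS", "min"),
    odds_max=("ODDS", "max"),
)

print(res)
```

Because the output names are chosen up front, no column flattening is needed afterwards. Whether this is faster than reset_index() on a large frame is not something this sketch measures.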