I have a DataFrame with a MultiIndex that looks like this when printed in the console:
                             value  indA  indB
           scenarioId group
2015-04-13 1          A      -54.0   1.0   1.0
                      B     -160.0   1.0   1.0
                      C      -15.0   0.0   1.0
           2          A      -83.0   1.0   1.0
           3          A      -80.0   2.0   2.0
           4          A     -270.0   2.0   2.0
2015-04-14 1          A      -56.0   1.0   1.0
                      B       -1.0   1.0   1.0
                      C      -60.0   0.0   1.0
           2          A      -32.0   1.0   1.0
           3          A      -91.0   2.0   2.0
           4          A      -17.0   2.0   2.0
I got it after using the groupby and sum functions on my initial dataset.
I would like to keep the same format, but order the rows according to the value column. I have tried hard to do this with the sorting functions, but I think the problem is that the first level of the MultiIndex (the dates) has no name.
Essentially, the output should look like this:
                             value  indA  indB
           scenarioId group
2015-04-13 1          B     -160.0   1.0   1.0
                      A      -54.0   1.0   1.0
                      C      -15.0   0.0   1.0
           2          A      -83.0   1.0   1.0
           3          A      -80.0   2.0   2.0
           4          A     -270.0   2.0   2.0
2015-04-14 1          C      -60.0   1.0   1.0
                      A      -56.0   1.0   1.0
                      B       -1.0   0.0   1.0
           2          A      -32.0   1.0   1.0
           3          A      -91.0   2.0   2.0
           4          A      -17.0   2.0   2.0
Could someone enlighten me on this please?
Thanks in advance.
To sort a DataFrame by the values in a single column, use .sort_values(). By default it returns a new DataFrame sorted in ascending order and does not modify the original.
You can sort a pandas DataFrame by one or more columns with sort_values(), in ascending or descending order.
Series.sort_index() sorts a Series by its index labels. It returns a new Series sorted by label if the inplace argument is False; otherwise it updates the original Series and returns None. The axis argument can only be 0 for a Series.
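As a minimal sketch of the behaviour described above, the following uses a small made-up DataFrame and Series (the names df, s and the toy values are assumptions for illustration, not the asker's data) to show that sort_values returns a new object and that sort_index orders by label:

```python
import pandas as pd

# Hypothetical toy data, not the asker's full dataset.
df = pd.DataFrame({"group": ["A", "B", "C"], "value": [-54.0, -160.0, -15.0]})

# sort_values returns a new, sorted DataFrame; df itself is left untouched.
ascending = df.sort_values("value")
descending = df.sort_values("value", ascending=False)

# Series.sort_index orders a Series by its index labels.
s = pd.Series([1, 2, 3], index=["c", "a", "b"])
by_label = s.sort_index()

print(ascending)
print(descending)
print(by_label)
```

Because nothing is modified in place here, the result has to be assigned to a name (or back to df) to be kept.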
You can use sort_values + sort_index:
print (df.sort_values('value').sort_index(level=[0,1], sort_remaining=False))
                             value  indA  indB
           scenarioId group
2015-04-13 1          B     -160.0   1.0   1.0
                      A      -54.0   1.0   1.0
                      C      -15.0   0.0   1.0
           2          A      -83.0   1.0   1.0
           3          A      -80.0   2.0   2.0
           4          A     -270.0   2.0   2.0
2015-04-14 1          C      -60.0   0.0   1.0
                      A      -56.0   1.0   1.0
                      B       -1.0   1.0   1.0
           2          A      -32.0   1.0   1.0
           3          A      -91.0   2.0   2.0
           4          A      -17.0   2.0   2.0
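To see why sort_remaining=False matters in the sort_values + sort_index approach, here is a small sketch on a hypothetical subset of the question's data (one date, two scenarios, dates stored as plain strings for brevity). With the default sort_remaining=True, the group level is sorted as well, which discards the ordering by value. With sort_remaining=False, only the listed levels are used as sort keys; the output shown in the answer suggests the earlier value ordering is then preserved within each (date, scenarioId) pair.

```python
import pandas as pd

# Hypothetical subset of the question's data: one date, two scenarios.
# The first index level is deliberately left unnamed, as in the question.
index = pd.MultiIndex.from_tuples(
    [
        ("2015-04-13", 1, "A"),
        ("2015-04-13", 1, "B"),
        ("2015-04-13", 1, "C"),
        ("2015-04-13", 2, "A"),
    ],
    names=[None, "scenarioId", "group"],
)
df = pd.DataFrame({"value": [-54.0, -160.0, -15.0, -83.0]}, index=index)

# Step 1: order every row by value, ignoring the index.
by_value = df.sort_values("value")

# Default (sort_remaining=True): 'group' is sorted too, so the value order is lost.
resorted = by_value.sort_index(level=[0, 1])

# sort_remaining=False: sort on levels 0 and 1 only.
kept = by_value.sort_index(level=[0, 1], sort_remaining=False)

print(resorted)
print(kept)
```

Levels are referenced by position (0 and 1) here, so the missing name on the date level does not get in the way.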
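Another solution - sort_values with reset_index and set_index (reset_index names the unnamed first level level_0; the parentheses let the chain span several lines):
df = (df.reset_index()
        .sort_values(['level_0','scenarioId','value'])
        .set_index(['level_0','scenarioId','group']))
print (df)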
                             value  indA  indB
level_0    scenarioId group
2015-04-13 1          B     -160.0   1.0   1.0
                      A      -54.0   1.0   1.0
                      C      -15.0   0.0   1.0
           2          A      -83.0   1.0   1.0
           3          A      -80.0   2.0   2.0
           4          A     -270.0   2.0   2.0
2015-04-14 1          C      -60.0   0.0   1.0
                      A      -56.0   1.0   1.0
                      B       -1.0   1.0   1.0
           2          A      -32.0   1.0   1.0
           3          A      -91.0   2.0   2.0
           4          A      -17.0   2.0   2.0
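For anyone who wants to reproduce the reset_index approach end to end, here is a self-contained sketch. The DataFrame is rebuilt by hand from the table in the question (the dates are kept as plain strings, which is an assumption; the real index may hold Timestamps), and the final step that removes the level_0 name again is an optional addition, not part of the answer above.

```python
import pandas as pd

# Rebuild the question's frame by hand; the first index level has no name.
rows = [
    ("2015-04-13", 1, "A", -54.0, 1.0, 1.0),
    ("2015-04-13", 1, "B", -160.0, 1.0, 1.0),
    ("2015-04-13", 1, "C", -15.0, 0.0, 1.0),
    ("2015-04-13", 2, "A", -83.0, 1.0, 1.0),
    ("2015-04-13", 3, "A", -80.0, 2.0, 2.0),
    ("2015-04-13", 4, "A", -270.0, 2.0, 2.0),
    ("2015-04-14", 1, "A", -56.0, 1.0, 1.0),
    ("2015-04-14", 1, "B", -1.0, 1.0, 1.0),
    ("2015-04-14", 1, "C", -60.0, 0.0, 1.0),
    ("2015-04-14", 2, "A", -32.0, 1.0, 1.0),
    ("2015-04-14", 3, "A", -91.0, 2.0, 2.0),
    ("2015-04-14", 4, "A", -17.0, 2.0, 2.0),
]
index = pd.MultiIndex.from_tuples(
    [r[:3] for r in rows], names=[None, "scenarioId", "group"]
)
df = pd.DataFrame(
    [r[3:] for r in rows], index=index, columns=["value", "indA", "indB"]
)

# reset_index turns the unnamed level into a column called 'level_0'.
result = (
    df.reset_index()
      .sort_values(["level_0", "scenarioId", "value"])
      .set_index(["level_0", "scenarioId", "group"])
)

# Optional: drop the 'level_0' name so the index looks like the original.
result.index.names = [None, "scenarioId", "group"]

print(result)
```

Sorting on explicit columns makes the ordering rules visible in one place: date first, then scenarioId, then value within each pair.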