I have a Pandas DataFrame
using a MultiIndex
on the rows:
index = pandas.MultiIndex.from_tuples(list(itertools.product(range(3), range(3))))
df = pandas.DataFrame(numpy.random.randn(9,3), index=index, columns=['A', 'B', 'C'])
A B C
0 0 2.400417 0.698638 1.231540
1 -0.023154 -2.110450 0.774964
2 -1.282392 -0.062794 1.471655
1 0 -1.081853 0.261876 -1.771075
1 -2.013747 -0.377957 -0.393802
2 1.711172 -0.552468 1.018727
2 0 0.155821 -0.222691 0.496586
1 0.563638 -0.756709 1.050212
2 -1.446159 -0.891549 0.256695
I would like to shuffle this DataFrame on the first level of the index, so a possible result would be:
A B C
1 0 -1.081853 0.261876 -1.771075
1 -2.013747 -0.377957 -0.393802
2 1.711172 -0.552468 1.018727
0 0 2.400417 0.698638 1.231540
1 -0.023154 -2.110450 0.774964
2 -1.282392 -0.062794 1.471655
2 0 0.155821 -0.222691 0.496586
1 0.563638 -0.756709 1.050212
2 -1.446159 -0.891549 0.256695
The reindex
method can accomplish this when passed a reordered array of tuples matching the desired order. At which point, reordering can be done as best fits your problem. For example:
In [38]: df
Out[38]:
A B C
0 0 -1.725337 0.111493 0.178294
1 -1.809003 -0.614219 -0.931909
2 0.621427 -0.186233 0.254727
1 0 -1.322863 1.242415 1.375579
1 0.249738 -1.280204 0.356491
2 -0.743671 0.325841 -0.167772
2 0 -0.070937 0.401172 -1.790801
1 1.433794 2.257198 1.848435
2 -1.021557 -1.054363 -1.485536
In [39]: neworder = [1, 0, 2]
In [41]: newindex = sorted(df.index, key=lambda x: neworder.index(x[0]))
In [42]: newindex
Out[42]:
[(1L, 0L),
(1L, 1L),
(1L, 2L),
(0L, 0L),
(0L, 1L),
(0L, 2L),
(2L, 0L),
(2L, 1L),
(2L, 2L)]
In [43]: df.reindex(newindex)
Out[43]:
A B C
1 0 -1.322863 1.242415 1.375579
1 0.249738 -1.280204 0.356491
2 -0.743671 0.325841 -0.167772
0 0 -1.725337 0.111493 0.178294
1 -1.809003 -0.614219 -0.931909
2 0.621427 -0.186233 0.254727
2 0 -0.070937 0.401172 -1.790801
1 1.433794 2.257198 1.848435
2 -1.021557 -1.054363 -1.485536
A simple, concise solution is
icol = df.index.levels[k]
df.loc[icol[permutation]]
which works for the outermost level. To get a general level, either use swaplevel
or IndexSlice
Full MWE:
import pandas
from io import StringIO
table = """
i j A B C
0 0 2.400417 0.698638 1.231540
1 -0.023154 -2.110450 0.774964
2 -1.282392 -0.062794 1.471655
1 0 -1.081853 0.261876 -1.771075
1 -2.013747 -0.377957 -0.393802
2 1.711172 -0.552468 1.018727
2 0 0.155821 -0.222691 0.496586
1 0.563638 -0.756709 1.050212
2 -1.446159 -0.891549 0.256695
"""
df = pandas.read_fwf(StringIO(table), dtype={"i":"UInt8", "j":"UInt8"})
df = df.ffill().set_index(["i", "j"])
i, j = df.index.levels
idx = pandas.IndexSlice
perm = [2,0,1]
df.loc[i[perm]]
Returns
A B C
i j
2 0 0.155821 -0.222691 0.496586
1 0.563638 -0.756709 1.050212
2 -1.446159 -0.891549 0.256695
0 0 2.400417 0.698638 1.231540
1 -0.023154 -2.110450 0.774964
2 -1.282392 -0.062794 1.471655
1 0 -1.081853 0.261876 -1.771075
1 -2.013747 -0.377957 -0.393802
2 1.711172 -0.552468 1.018727
And df.loc[idx[i, j[perm]], :]
gives
A B C
i j
0 2 -1.282392 -0.062794 1.471655
0 2.400417 0.698638 1.231540
1 -0.023154 -2.110450 0.774964
1 2 1.711172 -0.552468 1.018727
0 -1.081853 0.261876 -1.771075
1 -2.013747 -0.377957 -0.393802
2 2 -1.446159 -0.891549 0.256695
0 0.155821 -0.222691 0.496586
1 0.563638 -0.756709 1.050212
But be careful! Although df.loc[idx[i[perm], :], :]
and df.loc[idx[i[[perm]], j], :]
give the same result, the above df.loc[idx[i, j[perm]], :]
returns something different than the (maybe) unexpected result from df.loc[idx[:, j[perm]], :]
:
A B C
i j
0 2 -1.282392 -0.062794 1.471655
1 2 1.711172 -0.552468 1.018727
2 2 -1.446159 -0.891549 0.256695
0 0 2.400417 0.698638 1.231540
1 0 -1.081853 0.261876 -1.771075
2 0 0.155821 -0.222691 0.496586
0 1 -0.023154 -2.110450 0.774964
1 1 -2.013747 -0.377957 -0.393802
2 1 0.563638 -0.756709 1.050212
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With