I have the following pandas Series (related to the example here: pandas: slice a MultiIndex by range of secondary index):
import numpy as np
import pandas as pd
variable = np.repeat(['a','b','c'], [5,5,5])
time = [0,1,5,10,20,0,1,5,10,20,0,1,5,10,20]
arra = [variable, time]
index=pd.MultiIndex.from_arrays(arra, names=("variable", "time"))
s = pd.Series(
np.random.randn(len(index)),
index=index
)
Output would be
# In [1]: s
variable time
a 0 -1.284692
1 -0.313895
5 -0.980222
10 -1.452306
20 -0.423921
b 0 0.248625
1 0.183721
5 -0.733377
10 1.562653
20 -1.092559
c 0 0.061172
1 0.133960
5 0.765271
10 -0.648834
20 0.147158
dtype: float64
If I slice here using both index levels, it works like this:
# In [2]: s.loc[("a",0),:]
variable time
a 0 1.583589
1 -1.081401
5 -0.497904
10 0.352880
20 -0.179062
dtype: float64
But how can I slice only on the secondary index "time", e.g. at time=0, and get the matching row for every value of the first index? The following doesn't work:
# In [3]: s.loc[(0),:]
KeyError: 0
How would I do that here?
Use xs and specify the second level, or use loc with : to select all values of the first level and 0 to select the value of the second level:
print (s.xs(0, level=1))
Or:
print (s.loc[:, 0])
a 0.376784
b -0.643836
c -0.440340
dtype: float64
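As a small sketch of the two approaches above, the snippet below rebuilds a Series with the same index as in the question, but with deterministic values (np.arange, my own substitution for the random data) so the results are easy to check. It also shows two variations that may be useful: passing the level name "time" instead of its position, and drop_level=False to keep the "time" level in the result.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question; values are arbitrary placeholders.
variable = np.repeat(['a', 'b', 'c'], [5, 5, 5])
time = [0, 1, 5, 10, 20] * 3
index = pd.MultiIndex.from_arrays([variable, time], names=("variable", "time"))
s = pd.Series(np.arange(len(index), dtype=float), index=index)

# Cross-section on the second level, addressed by name instead of position.
# The "time" level is dropped from the result by default.
by_name = s.xs(0, level="time")

# Same selection, but keep the "time" level in the resulting index.
kept = s.xs(0, level="time", drop_level=False)

# loc equivalent: ":" selects every "variable", 0 selects time == 0.
by_loc = s.loc[:, 0]

print(by_name)
print(kept)
print(by_loc)
```

Using the level name tends to be more robust than level=1 if the order of the index levels ever changes.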
If you are working with both the index and column(s) of a DataFrame, use slicers:
idx = pd.IndexSlice
df = pd.concat([s,s * 10], axis=1, keys=['a','b'])
print (df)
a b
variable time
a 0 1.054582 10.545820
1 -1.716213 -17.162130
5 -0.187765 -1.877645
10 -0.419005 -4.190047
20 -0.772808 -7.728078
b 0 -0.022520 -0.225202
1 -0.638453 -6.384531
5 0.410156 4.101559
10 0.512189 5.121889
20 -1.241232 -12.412322
c 0 -0.134815 -1.348148
1 -1.007632 -10.076318
5 -0.859790 -8.597898
10 -0.623177 -6.231767
20 -0.635504 -6.355036
print (df.loc[idx[:, 0], 'a'])
variable time
a 0 1.054582
b 0 -0.022520
c 0 -0.134815
Name: a, dtype: float64
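The same slicer syntax can also pick several "time" values or a label range on the second level. The following is only a sketch: it builds a DataFrame with the same index as above but with deterministic values (an assumption made for readability), and it calls sort_index() first because label slices on a MultiIndex generally require a sorted index.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question; column values are placeholders.
variable = np.repeat(['a', 'b', 'c'], [5, 5, 5])
time = [0, 1, 5, 10, 20] * 3
index = pd.MultiIndex.from_arrays([variable, time], names=("variable", "time"))
df = pd.DataFrame({"a": np.arange(15.0), "b": np.arange(15.0) * 10}, index=index)

# Label slices on a MultiIndex need a sorted index; this one already is,
# so sort_index() leaves the row order unchanged.
df = df.sort_index()

idx = pd.IndexSlice

# Column 'a' at time 0 and time 20 for every "variable".
several = df.loc[idx[:, [0, 20]], 'a']

# All columns where 1 <= time <= 10 (label slices include both endpoints).
window = df.loc[idx[:, 1:10], :]

print(several)
print(window)
```

Note that, unlike positional slicing, the label slice 1:10 includes the row with time == 10.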