I have a MultiIndex with some levels labeled with strings, and others with integers:
import pandas as pd
metrics = ['PT', 'TF', 'AF']
n_replicates = 3
n_nodes = 6
cols = [(r,m,n) for r in range(n_replicates) for m in metrics for n in range(n_nodes)]
cols = pd.MultiIndex.from_tuples(cols,names = ['Replicates', 'Metrics', 'Nodes'])
ind = range(5)
df = pd.DataFrame(columns=cols, index=ind)
df.sortlevel(level=0, axis=1, inplace=True)
If I want to select a single column with an integer label, no problem:
df[2,'AF',10]
If I try to select a range, though:
df[1:4,'AF',10]
TypeError:
(No message given)
If I leave out the last level, I get a different error:
df.sortlevel(level=0,axis=1,inplace=True)
df[1:4,'AF']
TypeError: unhashable type
I suspect I'm playing with fire by using integers as column labels. Is the "safe" route simply to make them all strings? Or are there other ways of indexing MultiIndex dataframes with integer labels?
Edit: It's now clear to me that I should be using .loc. Good. However, it's still not clear to me how to interact with the lower levels of the MultiIndex.
df.loc[:,:] #Good
df.loc[:,1:2] #Good
df.loc[:,[1:2, 'AF']]
SyntaxError: invalid syntax
df.loc[:,1:2].xs('AF', level='Metrics', axis=1) #Good
Is the last line just what I need to use? If so, fine. It's just sufficiently long that it makes me feel I'm ignorant of a better way. Thanks for the help!
To index a given axis of a MultiIndex, you need to use a tuple, not a list. The "1:2" slice syntax isn't allowed inside a tuple, so you need to write slice(1,2) instead. You can then access the slice you're interested in with:
df.loc[:, (slice(1,2), 'AF')]
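For reference, here is a minimal runnable sketch of that selection, assuming a reasonably recent pandas. It rebuilds the frame from the question with from_product, fills it with zeros as placeholder data, and sorts the columns with sort_index, since sortlevel has been removed from newer pandas releases. It also shows pd.IndexSlice, which lets you keep the 1:2 notation and builds the same tuple of slices. The variable names are only illustrative.

```python
import pandas as pd

metrics = ['PT', 'TF', 'AF']
cols = pd.MultiIndex.from_product(
    [range(3), metrics, range(6)],
    names=['Replicates', 'Metrics', 'Nodes'],
)
df = pd.DataFrame(0, index=range(5), columns=cols)  # zeros are placeholder data
df = df.sort_index(axis=1)  # slicing a MultiIndex works best on sorted columns

# Tuple with an explicit slice object: replicates 1 through 2, metric 'AF'.
sub_tuple = df.loc[:, (slice(1, 2), 'AF')]

# pd.IndexSlice builds the same tuple while allowing the 1:2 notation.
idx = pd.IndexSlice
sub_idx = df.loc[:, idx[1:2, 'AF']]

# Adding a third element reaches the lowest level as well (here node 3).
sub_node = df.loc[:, idx[1:2, 'AF', 3]]
```

Note that label-based slices in .loc include both endpoints, so slice(1, 2) picks up replicates 1 and 2.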