Given this dataframe:
In [1]: import numpy as np
   ...: import pandas as pd
   ...: df = pd.DataFrame(np.random.rand(4, 4),
   ...:                   index=['A', 'B', 'C', 'All'],
   ...:                   columns=[2011, 2012, 2013, 'All']).round(2)
   ...: print(df)
Out[1]:
     2011  2012  2013   All
A    0.94  0.17  0.06  0.64
B    0.49  0.16  0.43  0.64
C    0.16  0.20  0.22  0.37
All  0.94  0.04  0.72  0.18
I'm trying to use pd.style to format the output of a dataframe. One keyword is subset, which defines where your formatting rules apply (for example: highlight max). The documentation for pd.style hints that it's better to use pd.IndexSlice for this:
The value passed to subset behaves similar to slicing a DataFrame.
- A scalar is treated as a column label
- A list (or series or numpy array) is treated as multiple column labels
- A tuple is treated as (row_indexer, column_indexer)
Consider using pd.IndexSlice to construct the tuple for the last one.
I'm trying to understand why it's failing in some cases.
Let's say I want to apply a bar to all rows but the first and last, and to all columns but the last.
This IndexSlice works:
In [2]: df.ix[pd.IndexSlice[1:-1,:-1]]
Out[2]:
   2011  2012  2013
B  0.49  0.16  0.43
C  0.16  0.20  0.22
But when passed to style.bar, it doesn't:
In [3]: df.style.bar(subset=pd.IndexSlice[1:-1,:-1], color='#d65f5f')
TypeError: cannot do slice indexing on <class 'pandas.indexes.base.Index'>
with these indexers [1] of <class 'int'>
Whereas if I pass it slightly differently, it works:
In [4]: df.style.bar(subset=pd.IndexSlice[df.index[1:-1], df.columns[:-1]],
   ...:              color='#d65f5f')
I'm confused about why the first form doesn't work. There seems to be little documentation on pd.IndexSlice, so maybe I'm missing something. It could also be a bug in pd.style (which is fairly new, available since 0.17.1 only).
Can someone explain what is wrong?
It's too bad this compatibility issue exists. From what I can tell, though, you answered your own question. The docs you quoted include this line:
A tuple is treated as (row_indexer, column_indexer)
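This is not quite what we get with the first slice, whose elements are positional integer slices rather than indexers built from the frame's labels: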
In [1]: pd.IndexSlice[1:-1, :-1]
Out[1]: (slice(1, -1, None), slice(None, -1, None))
but we do get something of that form from the second slice method:
In [2]: pd.IndexSlice[df.index[1:-1], df.columns[:-1]]
Out[2]: (Index(['B', 'C'], dtype='object'), Index([2011, 2012, 2013], dtype='object'))
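To see the difference outside of the styler, here is a minimal sketch. It assumes, as the traceback suggests, that subset is resolved through label-based indexing (the same way .loc works). The frame uses fixed values instead of random ones, and the names positional, loc_error and block are only for illustration.

```python
import numpy as np
import pandas as pd

# Fixed values instead of random ones so the result is reproducible.
df = pd.DataFrame(np.arange(16).reshape(4, 4) / 100,
                  index=['A', 'B', 'C', 'All'],
                  columns=[2011, 2012, 2013, 'All'])

# pd.IndexSlice hands back exactly what is written between the brackets.
positional = pd.IndexSlice[1:-1, :-1]

# Label-based lookup: the integer bounds are not labels of the string index.
try:
    df.loc[positional]
    loc_error = None
except TypeError as exc:
    loc_error = exc

# Position-based lookup accepts the same tuple of slices.
block = df.iloc[positional]
```

If the assumption holds, the TypeError captured in loc_error is the same kind of failure that style.bar reports, while the .iloc call returns rows B and C with the three year columns.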
I don't think that pd.IndexSlice even does anything except wrap the contents in a tuple for this second case. You can just do this:
df.style.bar(subset=(df.index[1:-1], df.columns[:-1]),
             color='#d65f5f')
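If you would rather keep slice syntax, a label-based slice may be an alternative. The sketch below is not from the docs quoted above. It checks the idea with .loc on a frame with fixed values, on the assumption that subset is looked up by label in the same way. The names by_label and by_position are made up for the example.

```python
import numpy as np
import pandas as pd

# Same layout as the frame in the question, with fixed values.
df = pd.DataFrame(np.arange(16).reshape(4, 4) / 100,
                  index=['A', 'B', 'C', 'All'],
                  columns=[2011, 2012, 2013, 'All'])

# Label slices include both endpoints, so 'B':'C' selects rows B and C.
subset = pd.IndexSlice['B':'C', df.columns[:-1]]

by_label = df.loc[subset]
by_position = df.iloc[1:-1, :-1]
```

Both selections cover the same block here, so passing a tuple like subset to style.bar should target the same cells as the explicit df.index[1:-1] version, provided the assumption about label-based lookup is right.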