How can I safely select rows in pandas by a list of labels?
I want to get an error when the list contains any non-existent label.
The loc method doesn't raise a KeyError as long as at least one of the requested labels is in the index, which is not strict enough for me.
For example:
import numpy as np
import pandas as pd

df = pd.DataFrame(index=list('abcde'), data={'A': np.arange(5) + 10})
df
    A
a  10
b  11
c  12
d  13
e  14
# here I would like to get an Error as 'xx' and 'yy' are not in the index
df.loc[['b', 'xx', 'yy']]
       A
b   11.0
xx   NaN
yy   NaN
Does pandas provide a method that would raise a KeyError instead of returning a bunch of NaNs for non-existent labels?
It's a bit of a hack, but you can do it like this:
def my_loc(df, idx):
    # Fail if any requested label is missing (assumes unique labels in both idx and the index)
    assert len(df.index[df.index.isin(idx)]) == len(idx), 'KeyError:the labels [{}] are not in the [index]'.format(idx)
    return df.loc[idx]
In [243]: my_loc(df, ['b', 'xx', 'yy'])
...
skipped
...
AssertionError: KeyError:the labels [['b', 'xx', 'yy']] are not in the [index]
In [245]: my_loc(df, ['a','c','e'])
Out[245]:
    A
a  10
c  12
e  14
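If you would rather get a real KeyError than an AssertionError, here is a minimal sketch of the same idea. The helper name strict_loc is made up for illustration and is not part of pandas. It also names only the missing labels in the message and tolerates duplicate labels in the list. As far as I know, recent pandas versions (1.0 and later) already raise a KeyError from loc when any label in the list is missing, so a wrapper like this mainly matters on older versions or when you want a custom message.

```python
import numpy as np
import pandas as pd


def strict_loc(df, labels):
    """Select rows by label; raise KeyError if any label is absent.

    Hypothetical helper for illustration, not a pandas API.
    """
    requested = pd.Index(list(labels))
    # Boolean mask over the requested labels: True where the label exists in df.index
    present = requested.isin(df.index)
    if not present.all():
        missing = list(requested[~present])
        raise KeyError("labels not in index: {}".format(missing))
    return df.loc[requested]


df = pd.DataFrame(index=list('abcde'), data={'A': np.arange(5) + 10})

# All labels exist, so this behaves like a plain df.loc[...] call
print(strict_loc(df, ['a', 'c', 'e']))

# 'xx' and 'yy' are not in the index, so a KeyError is raised
try:
    strict_loc(df, ['b', 'xx', 'yy'])
except KeyError as exc:
    print('KeyError:', exc)
```

Unlike the length comparison in my_loc, the mask is computed over the requested labels rather than over the index, so asking for the same label twice (for example ['a', 'a']) does not trigger a false failure.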