I have read this documentation:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
You can use a syntax like df.loc[df['shield'] > 6, ['max_speed']].
I looked through the source on GitHub and found the following:
Suppose you have a pandas.core.frame.DataFrame object, i.e. a DataFrame called df.
The type of df.loc is pandas.core.indexing._LocIndexer.
Nevertheless, I could not sort out these questions:
How do you make a Python function or class accept a syntax like the one above?
Where in the source code of pandas.core.frame.DataFrame is the property self.loc defined?
In general, you make a class accept that syntax by implementing __getitem__, which is an example of operator overloading. This allows an object of that class to be indexed with []. For example:
class get_item_example(object):
    def __getitem__(self, key):
        print(key)
Try it out:
>>> gi = get_item_example()
>>> gi['a']
a
>>> gi[['a','b','c']]
['a', 'b', 'c']
>>> gi['a','b','c']
('a', 'b', 'c')
In the case of df.loc[df['shield'] > 6, ['max_speed']], the key passed to __getitem__ is a tuple with two elements: the boolean pandas Series returned by df['shield'] > 6 and the single-item list ['max_speed'].
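To make the tuple explicit, here is a minimal sketch in plain Python. KeyRecorder is a hypothetical class made up for illustration, and the list mask is only a stand-in for the boolean Series that df['shield'] > 6 would produce; none of this is pandas code.

```python
class KeyRecorder:
    """Hypothetical helper: hands back whatever key [] was called with."""

    def __getitem__(self, key):
        return key


recorder = KeyRecorder()
mask = [True, False, True]            # stand-in for df['shield'] > 6
key = recorder[mask, ['max_speed']]   # the comma inside [] builds a tuple

# Unpack the two parts the same way an indexer could.
row_selector, column_selector = key
print(type(key).__name__)
print(row_selector)
print(column_selector)
```

An indexer written this way can check isinstance(key, tuple) to decide whether it received only a row selector or both a row and a column selector.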
In the pandas source, pandas.core.indexing._LocIndexer inherits its implementation of __getitem__ from pandas.core.indexing._LocationIndexer. The implementation is here: https://github.com/pandas-dev/pandas/blob/61362be9ea4d69b33ae421f1f98b8db50be611a2/pandas/core/indexing.py#L1374
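The same overall shape can be sketched in a few lines: a container exposes a loc property that returns an indexer object, and that indexer inherits __getitem__ from a base class. This is only an illustration of the pattern, not how pandas actually implements it; ToyFrame, _BaseIndexer and _ToyLocIndexer are hypothetical names, and the data is stored as a plain dict of lists.

```python
class _BaseIndexer:
    """Hypothetical base class that owns the [] logic."""

    def __init__(self, frame):
        self.frame = frame

    def __getitem__(self, key):
        # Accept both obj[rows] and obj[rows, cols].
        if isinstance(key, tuple):
            row_mask, columns = key
        else:
            row_mask, columns = key, list(self.frame.data)
        # Keep only the values whose mask entry is True.
        return {
            col: [v for v, keep in zip(self.frame.data[col], row_mask) if keep]
            for col in columns
        }


class _ToyLocIndexer(_BaseIndexer):
    """Inherits __getitem__ unchanged from the base class."""


class ToyFrame:
    def __init__(self, data):
        self.data = data  # dict: column name -> list of values

    @property
    def loc(self):
        # Each access returns an indexer bound to this frame.
        return _ToyLocIndexer(self)

    def __getitem__(self, column):
        return self.data[column]


frame = ToyFrame({"max_speed": [1, 4, 7], "shield": [2, 5, 8]})
mask = [s > 6 for s in frame["shield"]]   # plays the role of df['shield'] > 6
result = frame.loc[mask, ["max_speed"]]
print(result)
```

Because loc is a property rather than a method, frame.loc needs no parentheses, and the square brackets that follow are handled by the indexer object it returns.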