I have read this documentation:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
You can use a syntax like df.loc[df['shield'] > 6, ['max_speed']].
I looked through the source on GitHub and found the following:
Suppose you have a pandas.core.frame.DataFrame object, i.e. a DataFrame called df.
The type of df.loc is pandas.core.indexing._LocIndexer.
Nevertheless, I could not answer these questions:
How do you write a Python function or class that accepts a syntax like the one above?
Where in the source code of pandas.core.frame.DataFrame is the property self.loc defined?
In general, you make a class accept that syntax by implementing __getitem__, which is an example of operator overloading. This allows an object of that class to be indexed with []. For example:
class get_item_example(object):
    def __getitem__(self, key):
        # key is whatever was written between the square brackets
        print(key)
Try it out:
>>> gi = get_item_example()
>>> gi['a']
a
>>> gi[['a','b','c']]
['a', 'b', 'c']
>>> gi['a','b','c']
('a', 'b', 'c')
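Slice syntax goes through the same method. The short sketch below uses a hypothetical KeyRecorder class (not part of pandas) that returns the key instead of printing it, to show that a colon inside the brackets arrives as a built-in slice object, alone or inside a tuple:

```python
class KeyRecorder:
    """Hypothetical helper: hands back whatever key [] receives."""

    def __getitem__(self, key):
        return key


rec = KeyRecorder()

print(rec[1:3])             # slice(1, 3, None)
print(rec["a":"c", ["x"]])  # (slice('a', 'c', None), ['x'])
print(rec[:, "x"])          # (slice(None, None, None), 'x')
```

This is why an indexer can support forms such as obj[:, 'col']: its __getitem__ only has to inspect the slice objects it receives.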
In the case of df.loc[df['shield'] > 6, ['max_speed']], what happens is that the key passed to __getitem__ is a tuple containing the pandas Series returned by df['shield'] > 6 and the single-item list ['max_speed'].
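To make the tuple handling concrete, here is a minimal sketch in plain Python. TinyTable and LocLikeIndexer are hypothetical names made up for illustration, and exposing the indexer through a property is an assumed pattern. None of this reproduces the actual pandas code. It only shows how an attribute like .loc can return an object whose __getitem__ unpacks a (row mask, column list) tuple:

```python
class LocLikeIndexer:
    """Hypothetical stand-in for an indexer object; not pandas code."""

    def __init__(self, table):
        self.table = table

    def __getitem__(self, key):
        # obj[a, b] arrives here as the single tuple (a, b)
        if isinstance(key, tuple):
            row_mask, columns = key
        else:
            row_mask, columns = key, list(self.table.columns)
        # Keep rows whose mask entry is True, restricted to the chosen columns
        return [
            {col: row[col] for col in columns}
            for row, keep in zip(self.table.rows, row_mask)
            if keep
        ]


class TinyTable:
    """Hypothetical miniature table backed by a list of dicts."""

    def __init__(self, rows):
        self.rows = rows
        self.columns = list(rows[0]) if rows else []

    @property
    def loc(self):
        # Each access returns an indexer bound to this table
        return LocLikeIndexer(self)


table = TinyTable([
    {"max_speed": 1, "shield": 2},
    {"max_speed": 4, "shield": 5},
    {"max_speed": 7, "shield": 8},
])
mask = [row["shield"] > 6 for row in table.rows]  # plays the role of df['shield'] > 6
result = table.loc[mask, ["max_speed"]]
print(result)  # [{'max_speed': 7}]
```

The boolean list stands in for the Series of booleans, and the result is a plain list of dicts rather than a DataFrame.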
In the pandas source, pandas.core.indexing._LocIndexer inherits an implementation of __getitem__ from pandas.core.indexing._LocationIndexer. The implementation is here: https://github.com/pandas-dev/pandas/blob/61362be9ea4d69b33ae421f1f98b8db50be611a2/pandas/core/indexing.py#L1374