I come from a C++ background and started learning Python recently. While studying indexing and selecting data, I came across .iloc[] in the Series, DataFrame and Panel classes of the pandas library. I can't work out what .iloc is: is it a function or an attribute? I often mistakenly use () instead of [] and don't get the expected result (but no error is raised either).
Example:
In [43]: s = pd.Series(np.arange(5), index=np.arange(5)[::-1], dtype='int64')
In [44]: s[s.index.isin([2, 4, 6])]
Out[44]:
4 0
2 2
dtype: int64
In [45]: s.iloc(s.index.isin([2,4,6]))
Out[45]: <pandas.core.indexing._iLocIndexer at 0x7f1e68d53978>
In [46]: s.iloc[s.index.isin([2,4,6])]
Out[46]:
4 0
2 2
dtype: int64
Could anyone point me to a reference where I can learn more about this kind of operator?
The main distinction between the two indexers is: loc gets rows (and/or columns) with particular labels, while iloc gets rows (and/or columns) at integer locations.
Difference between loc and iloc in a pandas DataFrame: both are indexers used to slice data out of a DataFrame. They make it convenient to select rows and columns, and they can also be used to filter data according to a condition.
loc is an instance of a _LocIndexer class. The loc[] syntax works because _LocIndexer defines __getitem__ and __setitem__, which are the methods Python calls whenever you use square-bracket syntax.
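To make that mechanism concrete, here is a minimal sketch in plain Python. The classes ToySeries and PositionIndexer are hypothetical names made up for illustration; they are not pandas classes and only mimic the general shape of an attribute that returns an object supporting square brackets.

```python
class PositionIndexer:
    """Hypothetical stand-in for an indexer object (not a pandas class)."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # Python calls this for: indexer[key]
        return self._data[key]

    def __setitem__(self, key, value):
        # Python calls this for: indexer[key] = value
        self._data[key] = value


class ToySeries:
    """Hypothetical container exposing an attribute that supports []."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def iloc(self):
        # Attribute access only returns an indexer object;
        # no lookup happens until square brackets are applied to it.
        return PositionIndexer(self._values)


toy = ToySeries([10, 20, 30])
first = toy.iloc[0]       # equivalent to toy.iloc.__getitem__(0)
toy.iloc[1] = 99          # equivalent to toy.iloc.__setitem__(1, 99)
updated = toy.iloc[1]
```

So, in this sketch, iloc is an attribute (a property) whose value is an object, and the square brackets are handled by that object's __getitem__ and __setitem__.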
loc is used to index a pandas DataFrame or Series using labels. On the other hand, iloc can be used to retrieve records based on their positional index.
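The difference is easiest to see when labels and positions disagree. The sketch below reuses the Series from the question, whose index runs from 4 down to 0; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Index labels are 4, 3, 2, 1, 0; values are 0, 1, 2, 3, 4
s = pd.Series(np.arange(5), index=np.arange(5)[::-1], dtype="int64")

by_label = s.loc[4]        # the row whose label is 4 (the first row)
by_position = s.iloc[4]    # the row at position 4 (the last row)

label_pick = s.loc[[4, 2]]     # rows labelled 4 and 2
position_slice = s.iloc[0:2]   # rows at positions 0 and 1 (end excluded)
```

Here by_label and by_position differ even though both were asked for "4", because one 4 is a label and the other is a position.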
The practical answer: You should think of iloc and loc as pandas extensions of the Python list and dictionary respectively, and treat them as lookups rather than function or method calls. Thus, in keeping with Python syntax, always use [] rather than ().
>>> import pandas as pd
>>> ser = pd.Series( { 'a':3, 'c':9 } )
>>> ser.loc['a'] # pandas dictionary syntax (label-based)
3
>>> ser.iloc[0] # pandas list/array syntax (location-based)
3
It's basically the same for DataFrames, just with an extra dimension to specify, and that's also where iloc and loc become more useful, but that's beyond the scope of this question.
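For completeness, a small sketch of the two-dimensional case. The DataFrame and its column names are made up for illustration; the first item inside the brackets selects rows and the second selects columns.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]},
    index=["a", "b", "c"],
)

cell_by_label = df.loc["b", "y"]    # row labelled "b", column labelled "y"
cell_by_position = df.iloc[1, 1]    # second row, second column

# Label slices include the end label; position slices exclude the end position
rows_by_label = df.loc["a":"b", ["x"]]
rows_by_position = df.iloc[0:2, 0:1]
```

Both pairs select the same data here; which form to use depends on whether you know the labels or the positions.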
The deeper answer: If you really want to understand this at a deeper level, you need to understand __getitem__. You could perhaps start here for some basics. The answers in the second link provided in the comments above by @ayhan are also excellent and quite relevant to your question.
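You can check this on a real pandas object. The sketch below only inspects what ser.iloc is and confirms that square brackets are routed through __getitem__; the exact class name printed is an internal detail and may differ between pandas versions.

```python
import pandas as pd

ser = pd.Series({"a": 3, "c": 9})

indexer = ser.iloc                       # plain attribute access, no lookup yet
has_getitem = hasattr(type(indexer), "__getitem__")

via_brackets = ser.iloc[0]               # the usual spelling
via_dunder = ser.iloc.__getitem__(0)     # what the brackets translate to

print(type(indexer))                     # an internal pandas indexer class
```

This also explains the session in the question: s.iloc(...) with parentheses handed back an indexer object rather than the selected rows, so no data came out and no error was raised.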