
What is the type of loc and iloc? (brackets vs parentheses)

Tags: python, pandas

I come from a C++ background and recently started learning Python. While studying indexing and selecting data, I came across .iloc[] on the Series, DataFrame and Panel classes in the pandas library. I don't understand what .iloc is: is it a function or an attribute? I often mistakenly use () instead of [] and don't get the expected result, yet no error is raised.

Example:

In [43]: s = pd.Series(np.arange(5), index=np.arange(5)[::-1], dtype='int64')

In [44]: s[s.index.isin([2, 4, 6])]
Out[44]: 
4    0
2    2
dtype: int64

In [45]: s.iloc(s.index.isin([2,4,6]))
Out[45]: <pandas.core.indexing._iLocIndexer at 0x7f1e68d53978>

In [46]: s.iloc[s.index.isin([2,4,6])]
Out[46]: 
4    0
2    2
dtype: int64

Could anyone point me to a reference where I can learn more about this kind of operator?

Vedanshu asked Jun 21 '17 11:06



1 Answer

The practical answer: Think of iloc and loc as pandas extensions of the Python list and dictionary, respectively, and treat them as lookups rather than function or method calls. Thus, in keeping with Python syntax, always use [] rather than ().

>>> import pandas as pd
>>> ser = pd.Series({'a': 3, 'c': 9})

>>> ser.loc['a']    # pandas dictionary syntax (label-based)
3
>>> ser.iloc[0]     # pandas list/array syntax (location-based)
3

It works basically the same way for DataFrames, just with an extra dimension to specify. That is also where iloc and loc become more useful, but that is beyond the scope of this question.
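For reference, here is a minimal sketch of the two-dimensional case. The DataFrame `df` and its contents are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical example data, not taken from the question.
df = pd.DataFrame(
    {"x": [10, 20, 30], "y": [1.5, 2.5, 3.5]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "x"]        # row label "b", column label "x"
by_position = df.iloc[1, 0]        # second row, first column

rows_by_label = df.loc["a":"b"]    # label slices include both endpoints
rows_by_position = df.iloc[0:2]    # positional slices exclude the stop

print(by_label, by_position)
print(rows_by_label)
print(rows_by_position)
```

Both lookups take a row selector followed by a column selector inside a single pair of square brackets. Note that the two slices select the same rows even though they are written differently.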

The deeper answer: If you want to understand this at a deeper level, you need to understand __getitem__, the method Python calls whenever you use square brackets on an object. You could perhaps start here for some basics. The answers in the second link provided in the comments above by @ayhan are also excellent and quite relevant to your question.
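To make the role of __getitem__ concrete, here is a toy sketch. `ToySeries` and `ToyILocIndexer` are hypothetical classes written for this answer, not pandas code; they only imitate the behavior seen in the question, where `s.iloc[...]` performs a lookup and `s.iloc(...)` hands back an indexer object without raising an error.

```python
class ToyILocIndexer:
    """Hypothetical stand-in for a positional indexer; not pandas code."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, key):
        # Python calls this for obj[key]; this is the actual lookup.
        if isinstance(key, (int, slice)):
            return self._data[key]
        # Anything else is treated as a boolean mask in this sketch.
        return [value for value, keep in zip(self._data, key) if keep]

    def __call__(self, *args, **kwargs):
        # Python calls this for obj(...); no lookup happens here,
        # the indexer itself is returned (an assumption modeled on Out[45]).
        return self


class ToySeries:
    """Hypothetical container exposing iloc as an attribute."""

    def __init__(self, data):
        self._data = list(data)

    @property
    def iloc(self):
        # iloc is an attribute that returns an indexer object.
        return ToyILocIndexer(self._data)


s = ToySeries([0, 1, 2, 3, 4])
mask = [True, False, True, False, False]

print(s.iloc[0])      # square brackets go through __getitem__
print(s.iloc[1:3])
print(s.iloc[mask])
print(s.iloc(mask))   # parentheses go through __call__: an indexer object
```

So in this sketch iloc is neither a plain function nor plain data: it is an attribute whose value is an object that supports square-bracket lookup, which is why [] gives the selected values and () does not.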

JohnE answered Nov 12 '22 12:11