In Python, how is the in operator implemented to work? Does it use the next() method of the iterators?

Tags: python, in-operator

In Python, it is known that `in` checks for membership in iterables (lists, dictionaries, etc.) and looks for substrings in strings. My question is about how `in` is implemented to achieve all of the following: 1) testing for membership, 2) testing for substrings, and 3) accessing the next element in a for-loop. For example, when `for i in myList:` or `if i in myList:` is executed, does `in` call `myList.__next__()`? If it does, how does it work with strings, given that str objects are not iterators (as checked in Python 2.7) and so do not have a `next()` method? If a detailed discussion of the implementation of `in` is not possible, I would appreciate a gist of it.

540

asked Nov 29 '18 15:11

bp14

1 Answer

A class can define how the `in` operator works on instances of that class by defining a `__contains__` method.
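As a minimal sketch of this hook (the class names `EvenNumbers` and `Truthy` are hypothetical, made up for illustration), the following shows `in` dispatching to `__contains__` and converting whatever it returns to a bool:

```python
class EvenNumbers:
    """Hypothetical container that 'contains' every even integer."""

    def __contains__(self, item):
        # Invoked for `item in EvenNumbers()`.
        return isinstance(item, int) and item % 2 == 0


class Truthy:
    """Hypothetical class whose __contains__ returns a non-bool value."""

    def __contains__(self, item):
        return "yes"  # truthy, so `in` evaluates to True


evens = EvenNumbers()
print(4 in evens)       # True
print(7 in evens)       # False
print(7 not in evens)   # True
print(1 in Truthy())    # True: the string "yes" is converted to a bool
```

No iteration takes place here: `EvenNumbers` defines neither `__iter__` nor `__next__`, yet the membership test works.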

The Python data model documentation says:

> For objects that don’t define `__contains__()`, the membership test first tries iteration via `__iter__()`, then the old sequence iteration protocol via `__getitem__()`, see this section in the language reference.

Section 6.10.2, "Membership test operations", of the Python language reference has this to say:

> The operators `in` and `not in` test for membership. `x in s` evaluates to `True` if x is a member of s, and `False` otherwise. `x not in s` returns the negation of `x in s`. All built-in sequences and set types support this as well as dictionary, for which `in` tests whether the dictionary has a given key. For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression `x in y` is equivalent to `any(x is e or x == e for e in y)`.
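The "`x is e or x == e`" part of that rule is easiest to see with NaN, which never compares equal to itself. The sketch below assumes Python 3; the helper `contains` is a hypothetical name that simply wraps the documented equivalent expression:

```python
nan = float("nan")

print(nan == nan)              # False: NaN never compares equal to itself
print(nan in [nan])            # True: the identity check `x is e` succeeds
print(float("nan") in [nan])   # False: a different NaN object, and == fails


def contains(x, y):
    # The equivalent expression quoted from the language reference.
    return any(x is e or x == e for e in y)


print(contains(nan, [nan]))    # True

# For a dict, `in` tests keys, not values.
ages = {"alice": 30}
print("alice" in ages)         # True
print(30 in ages)              # False
```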

> For the string and bytes types, `x in y` is `True` if and only if x is a substring of y. An equivalent test is `y.find(x) != -1`. Empty strings are always considered to be a substring of any other string, so `"" in "abc"` will return `True`.
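A short illustration of the substring rule, assuming Python 3 (the variable `text` is only for demonstration). It also contrasts strings with a list, where `in` compares whole elements instead:

```python
text = "abc"

print("bc" in text)              # True: "bc" is a contiguous substring
print("ac" in text)              # False: both characters occur, but not adjacently
print(text.find("bc") != -1)     # True: the equivalent test from the documentation
print("" in text)                # True: the empty string is a substring of any string

# bytes objects follow the same substring rule.
print(b"bc" in b"abc")           # True

# A list compares whole elements, so there is no substring matching.
print("bc" in ["a", "b", "c"])   # False
```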

> For user-defined classes which define the `__contains__()` method, `x in y` returns `True` if `y.__contains__(x)` returns a true value, and `False` otherwise.

> For user-defined classes which do not define `__contains__()` but do define `__iter__()`, `x in y` is `True` if some value `z` with `x == z` is produced while iterating over `y`. If an exception is raised during the iteration, it is as if `in` raised that exception.
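To illustrate this fallback, here is a sketch using a hypothetical `Countdown` class that defines only `__iter__`. The `produced` list is an illustrative addition that records which values were requested, showing that the search stops at the first match. This is the one case where `in` ends up driving an iterator's `__next__()`:

```python
class Countdown:
    """Hypothetical iterable: defines __iter__ but not __contains__."""

    def __init__(self, start):
        self.start = start
        self.produced = []  # records every value handed out, for illustration

    def __iter__(self):
        n = self.start
        while n > 0:
            self.produced.append(n)
            yield n
            n -= 1


countdown = Countdown(5)
print(4 in countdown)        # True
print(countdown.produced)    # [5, 4]: iteration stopped at the first match

# A one-shot iterator is partly consumed by the search.
it = iter([1, 2, 3, 4])
print(2 in it)               # True
print(list(it))              # [3, 4]


def failing():
    # Hypothetical generator that fails partway through iteration.
    yield 1
    raise ValueError("boom")


try:
    found = 2 in failing()
except ValueError as exc:
    print("raised by in:", exc)   # the exception propagates out of `in`
```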

> Lastly, the old-style iteration protocol is tried: if a class defines `__getitem__()`, `x in y` is `True` if and only if there is a non-negative integer index i such that `x == y[i]`, and all lower integer indices do not raise `IndexError` exception. (If any other exception is raised, it is as if `in` raised that exception).
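The old-style protocol can be sketched with a hypothetical `Squares` class that defines nothing but `__getitem__`. Python indexes it with 0, 1, 2, ... until `IndexError` is raised:

```python
class Squares:
    """Hypothetical class with only __getitem__ (no __contains__, no __iter__)."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if index >= self.count:
            raise IndexError(index)  # signals the end of the sequence
        return index * index


squares = Squares(5)       # behaves like [0, 1, 4, 9, 16]
print(9 in squares)        # True: found at index 3
print(10 in squares)       # False: the IndexError at index 5 ends the search
print(list(squares))       # [0, 1, 4, 9, 16]
```

The last line shows that ordinary iteration uses the same fallback, which is how objects without `__iter__` can still be looped over.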

> The operator `not in` is defined to have the inverse truth value of `in`.

As a comment on the question points out, the expression operator `in` is distinct from the keyword `in`, which forms a part of the `for` statement. In the Python grammar, the `in` is "hardcoded" as a part of the syntax of `for`:

for_stmt ::=  "for" target_list "in" expression_list ":" suite
              ["else" ":" suite]

So in the context of a `for` statement, `in` doesn't behave as an operator; it is simply a syntactic marker that separates the `target_list` from the `expression_list`.
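For the `for` statement itself, it is the statement that obtains an iterator and advances it, not `in`. The loop below is a rough hand-written equivalent, assuming Python 3 (where the method is `__next__`; Python 2 spelled it `next`). The names `my_list` and `seen` are illustrative only:

```python
my_list = ["a", "b", "c"]
seen = []

# Roughly what `for item in my_list: seen.append(item)` does:
iterator = iter(my_list)          # calls my_list.__iter__()
while True:
    try:
        item = next(iterator)     # calls iterator.__next__()
    except StopIteration:
        break                     # the loop ends when the iterator is exhausted
    seen.append(item)

print(seen)                       # ['a', 'b', 'c']

# A str is iterable but is not itself an iterator, so it has no
# __next__ method; the iterator returned by iter() does.
print(hasattr("abc", "__next__"))         # False
print(hasattr(iter("abc"), "__next__"))   # True
```

This is why `for ch in "abc":` works even though `str` has no `__next__()`: `__next__()` is called on the iterator returned by `iter("abc")`, never on the string itself.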

132

answered Oct 06 '22 10:10

Daniel Pryden
