When using the 'in' operator to search for an item in a list e.g.
if item in list:
print item
What algorithm is used to search for this item. Is it a straight search of the list from beginning to end or does it use something like binary search?
Python's in operator lets you loop through all the members of a collection(such as a list or a tuple) and check if there's a member in the list that's equal to the given item.
To find an element in the list, use the Python list index() method, The index() is an inbuilt Python method that searches for an item in the list and returns its index. The index() method finds the given element in the list and returns its position.
Searching is a very basic necessity when you store data in different data structures. The simplest approach is to go across every element in the data structure and match it with the value you are searching for. This is known as Linear search.
list
s can't be assumed to be in sorted order (or any order at all), so binary search won't work. Nor can the keys be assumed to be hashable, so unlike a dict
or set
a hash-table lookup can't be used to accelerate the search
At a guess it's a straight-through check of every element from first to last.
I'll try and dig up the relevant Python source code.
--
Edit: The Python list.__contains__()
function, which implements the in
operator, is defined in listobject.c:
393 static int
394 list_contains(PyListObject *a, PyObject *el)
395 {
396 Py_ssize_t i;
397 int cmp;
398
399 for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
400 cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
401 Py_EQ);
402 return cmp;
403 }
It iterates over every element in the list, from the first element to the last element (or until it finds a match.) No shortcuts here.
--
Edit 2: The plot thickens. If Python detects that you're testing for membership of an element in a constant list
or set
, like:
if letter in ['a','e','i','o','u']: # list version
if letter in {'a','e','i','o','u'}: # set version
Edit 3 [@JohnMachin]:
The constant list is optimised to a constant tuple in 2.5-2.7 and 3.1-3.3.
The constant set is optimised to a (constant) frozenset in 3.3.
See also @CoryCarson's answer.
If list
is a literal list, Python 3.2+ will take a faster approach: http://docs.python.org/dev/whatsnew/3.2.html#optimizations
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With