What's the computational cost of the count operation on strings in Python?

Tags:

python

big-o

python-2.7

For example:

'hello'.count('e')

Is this O(n)? I'm guessing the way it works is it scans 'hello' and increments a counter each time the letter 'e' is seen. How can I know this without guessing? I tried reading the source code here, but got stuck upon finding this:

def count(s, *args):
    """count(s, sub[, start[,end]]) -> int

    Return the number of occurrences of substring sub in string
    s[start:end].  Optional arguments start and end are
    interpreted as in slice notation.

    """
    return s.count(*args)

Where can I read about what's executed in s.count(*args)?

edit: I understand what *args does in the context of Python functions.

440

asked Mar 07 '16 22:03

nacho

2 Answers

str.count is implemented in native code, in the stringobject.c file. It delegates either to stringlib_count or to PyUnicode_Count, which itself delegates to stringlib_count. stringlib_count ultimately uses fastsearch to find occurrences of the substring in the string and count them.

For one-character strings (e.g. your 'e'), it is short-circuited to the following code path:

for (i = 0; i < n; i++)
    if (s[i] == p[0]) {
        count++;
        if (count == maxcount)
            return maxcount;
    }
return count;

So yes, this is exactly what you assumed: a simple iteration over the string that counts the occurrences of the substring.

For search strings longer than a single character it gets a bit more complicated, since it has to deal with overlapping matches and so on, and the logic is buried deeper in the fastsearch implementation. But it's essentially the same: a linear search through the string.
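To make the overlap point concrete, here is a minimal pure-Python sketch of non-overlapping counting. The helper name `count_non_overlapping` is hypothetical, and this is not how CPython does it internally (the real logic lives in the C fastsearch code). It only mirrors the observable behaviour of `str.count` for a non-empty substring.

```python
def count_non_overlapping(s, sub):
    """Count non-overlapping occurrences of a non-empty `sub` in `s`.

    Hypothetical helper for illustration only; not CPython's implementation.
    """
    if not sub:
        raise ValueError("this sketch only handles a non-empty substring")
    count = 0
    start = 0
    while True:
        pos = s.find(sub, start)
        if pos == -1:
            return count
        count += 1
        # Resume after the end of the match so that matches cannot overlap.
        start = pos + len(sub)


print(count_non_overlapping("hello", "l"))  # 2
print(count_non_overlapping("aaaa", "aa"))  # 2, not 3: overlaps are skipped
```

Both results agree with what `'hello'.count('l')` and `'aaaa'.count('aa')` return.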

So yes, str.count runs in linear time, O(n), where n is the length of the string being searched. And if you think about it, that makes a lot of sense: in order to know how often a substring appears in a string, you need to look at every possible substring of the same length. So for a substring of length 1, you have to look at every character in the string, which gives you linear complexity.
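If you would rather check the linear behaviour empirically than read C, a rough timing sketch like the following can help. It assumes Python 3.6+ (for the `10_000` literals); the helper name `time_count` and the chosen sizes are arbitrary. Timings are only evidence, not proof, and they vary by machine and interpreter version.

```python
import timeit


def time_count(n, repeats=5):
    """Best-of-`repeats` time (seconds) for 100 count calls on an n-char string."""
    s = "a" * n  # 'e' never occurs, so the whole string has to be scanned
    return min(timeit.repeat(lambda: s.count("e"), number=100, repeat=repeats))


for n in (10_000, 100_000, 1_000_000):
    print(n, time_count(n))
```

If the cost is linear, each tenfold increase in `n` should increase the measured time by roughly a factor of ten, give or take some noise at the small end.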

By the way, for more information about the underlying fastsearch algorithm, see this article on effbot.org.

For Python 3, which only has a single Unicode string type, the links to the implementations are: unicode_count which uses stringlib_count which uses fastsearch.

169

answered Jan 11 '23 04:01

poke

Much of Python's library code is written in C. The code you are looking for is here:

http://svn.python.org/view/python/trunk/Objects/stringobject.c?view=markup

static PyMethodDef
string_methods[] = {
    // ...
    {"count", (PyCFunction)string_count, METH_VARARGS, count__doc__},
    // ...
    {NULL,     NULL}                         /* sentinel */
};

static PyObject *
string_count(PyStringObject *self, PyObject *args) {
    ...
}
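As for how to tell, without guessing, that a method is implemented in C: a small introspection sketch like this one can confirm it from the Python side. It assumes CPython and Python 3; the helper name `has_python_source` is hypothetical, and the exact type name printed is a CPython detail.

```python
import inspect


def has_python_source(obj):
    """Return True if inspect can retrieve Python source code for `obj`."""
    try:
        inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: the object is not backed by Python code (e.g. a C method).
        # OSError: it is Python code, but the source file is not available.
        return False
    return True


print(type(str.count).__name__)        # a descriptor type on CPython
print(hasattr(str.count, "__code__"))  # False: no Python bytecode attached
print(has_python_source(str.count))    # False: the body lives in C
```

When this reports no Python source, the next place to look is the C file for the type in the CPython repository, as in the listing above.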

answered Jan 11 '23 04:01

AJNeufeld
