A similar question has already been ask Cost of len() function here. However, this question looks at the cost of <code>len</code> it self. Suppose, I have a code that repeats many times <code>len(List)</code>, every time is <code>O(1)</code>, reading a variable is also <code>O(1)</code> plus assigning it is also <code>O(1)</code>. As a side note, I find that <code>n_files = len(Files)</code> is somewhat more readable than repeated <code>len(Files)</code> in my code. So, that is already an incentive for me to do this. You could also argue against me, that somewhere in the code <code>Files</code> can be modified, so <code>n_files</code> is no longer correct, but that is not the case. My question is: Is the a number of calls to <code>len(Files)</code> after which accessing <code>n_files</code> will be faster?

A few results (time, in seconds, for one million calls), with a ten-element list using Python 2.7.10 on Windows 7; <code>store</code> is whether we store the length or keeping calling <code>len</code>, and <code>alias</code> is whether or not we create a local alias for <code>len</code>: <pre class="prettyprint"><code>Store Alias n= 1 10 100 Yes Yes 0.862 1.379 6.669 Yes No 0.792 1.337 6.543 No Yes 0.914 1.924 11.616 No No 0.879 1.987 12.617 </code></pre> and a thousand-element list: <pre class="prettyprint"><code>Store Alias n= 1 10 100 Yes Yes 0.877 1.369 6.661 Yes No 0.785 1.299 6.808 No Yes 0.926 1.886 11.720 No No 0.891 1.948 12.843 </code></pre> Conclusions: <ul> <li>Storing the result is more efficient than calling <code>len</code> repeatedly, even for <code>n == 1</code>;</li> <li>Creating a local alias for <code>len</code> can make a small improvement for larger <code>n</code> where we aren't storing the result, but not as much as just storing the result would; and</li> <li>The influence of the length of the list is negligible, suggesting that whether or not the integers are interned isn't making any difference.</li> </ul> Test script: <pre class="prettyprint"><code>def test(n, l, store, alias): if alias: len_ = len len_l = len_(l) else: len_l = len(l) for _ in range(n): if store: _ = len_l elif alias: _ = len_(l) else: _ = len(l) if __name__ == '__main__': from itertools import product from timeit import timeit setup = 'from __main__ import test, l' for n, l, store, alias in product( (1, 10, 100), ([None]*10,), (True, False), (True, False), ): test_case = 'test({!r}, l, {!r}, {!r})'.format(n, store, alias) print test_case, len(l), print timeit(test_case, setup=setup) </code></pre>

Function calls in python are costly, so if you are 100% sure that the size of <code>n_files</code> would not change when you are accessing its length from the variable, you can use the variable, if that is what is more readable for you as well. An Example performance test for both accessing <code>len(list)</code> and accessing from variable , gives the following result - <pre class="prettyprint"><code>In [36]: l = list(range(100000)) In [37]: n_l = len(l) In [40]: %timeit newn = len(l) 10000000 loops, best of 3: 92.8 ns per loop In [41]: %timeit new_n = n_l 10000000 loops, best of 3: 33.1 ns per loop </code></pre> Accessing the variable is always faster than using <code>len()</code> .

perfomance of len(List) vs reading a variable

Tags:

performance

python

A similar question has already been ask Cost of len() function here. However, this question looks at the cost of len it self. Suppose, I have a code that repeats many times len(List), every time is O(1), reading a variable is also O(1) plus assigning it is also O(1).

As a side note, I find that n_files = len(Files) is somewhat more readable than repeated len(Files) in my code. So, that is already an incentive for me to do this. You could also argue against me, that somewhere in the code Files can be modified, so n_files is no longer correct, but that is not the case.

My question is:
Is the a number of calls to len(Files) after which accessing n_files will be faster?

630

asked Jul 22 '15 10:07

oz123

2 Answers

A few results (time, in seconds, for one million calls), with a ten-element list using Python 2.7.10 on Windows 7; store is whether we store the length or keeping calling len, and alias is whether or not we create a local alias for len:

Store Alias n=      1      10     100
Yes   Yes       0.862   1.379   6.669
Yes   No        0.792   1.337   6.543
No    Yes       0.914   1.924  11.616
No    No        0.879   1.987  12.617

and a thousand-element list:

Store Alias n=      1      10     100
Yes   Yes       0.877   1.369   6.661
Yes   No        0.785   1.299   6.808
No    Yes       0.926   1.886  11.720
No    No        0.891   1.948  12.843

Conclusions:

Storing the result is more efficient than calling len repeatedly, even for n == 1;
Creating a local alias for len can make a small improvement for larger n where we aren't storing the result, but not as much as just storing the result would; and
The influence of the length of the list is negligible, suggesting that whether or not the integers are interned isn't making any difference.

Test script:

def test(n, l, store, alias):
    if alias:
        len_ = len
        len_l = len_(l)
    else:
        len_l = len(l)
    for _ in range(n):
        if store:
            _ = len_l
        elif alias:
            _ = len_(l)
        else:
            _ = len(l)

if __name__ == '__main__':
    from itertools import product
    from timeit import timeit
    setup = 'from __main__ import test, l'
    for n, l, store, alias in product(
        (1, 10, 100),
        ([None]*10,),
        (True, False),
        (True, False),
    ):
        test_case = 'test({!r}, l, {!r}, {!r})'.format(n, store, alias)
        print test_case, len(l),
        print timeit(test_case, setup=setup)

192

answered Nov 11 '22 15:11

jonrsharpe

Function calls in python are costly, so if you are 100% sure that the size of n_files would not change when you are accessing its length from the variable, you can use the variable, if that is what is more readable for you as well.

An Example performance test for both accessing len(list) and accessing from variable , gives the following result -

In [36]: l = list(range(100000))

In [37]: n_l = len(l)

In [40]: %timeit newn = len(l)
10000000 loops, best of 3: 92.8 ns per loop

In [41]: %timeit new_n = n_l
10000000 loops, best of 3: 33.1 ns per loop

Accessing the variable is always faster than using len() .

answered Nov 11 '22 14:11

Anand S Kumar

Related questions
                            
                                Django collecstatic boto broken pipe on large file upload
                            
                                Why does from __future__ import * raise an error?
                            
                                How to find error on slope and intercept using numpy.polyfit
                            
                                Best way to store python datetime.time in a sqlite3 column?
                            
                                Passing array range as argument to a function?
                            
                                Combining two numpy arrays to form an array with the largest value from each array
                            
                                How to add a prefix to an existing python logging formatter
                            
                                UndefinedVariableError when querying pandas DataFrame
                            
                                How to download files with Box API & Python
                            
                                Copy a generator
                            
                                100% area plot of a pandas DataFrame
                            
                                Slicing a list and group
                            
                                Upgrade permission denied: how to upgrade pip on Mac OS X? [closed]
                            
                                Expanding a block of numbers in Python
                            
                                Stereo to Mono wav in Python
                            
                                multiprocessing.Pool.imap_unordered with fixed queue size or buffer?
                            
                                Django runserver from Python script
                            
                                string formatting a sql query in sqlite3
                            
                                Slow ANTLR4 generated Parser in Python, but fast in Java
                            
                                Adjust exponent text after setting scientific limits on matplotlib axis

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With