Is it efficient to build a list with a generator function

When reading the book 'Effective Python' by Brett Slatkin I noticed that the author suggested that sometimes building a list using a generator function and calling list on the resulting iterator could lead to cleaner, more readable code.

So an example:

num_list = range(100)

def num_squared_iterator(nums):
    for i in nums:
        yield i**2

def get_num_squared_list(nums):
    l = []
    for i in nums:
        l.append(i**2)
    return l

Where a user could call

l = list(num_squared_iterator(num_list))

or

l = get_num_squared_list(num_list)

and get the same result.

The suggestion was that the generator function has less noise because it is shorter and does not have the extra code for creating the list and appending values to it.

(Note: clearly, for these simple examples a list comprehension or generator expression would be better, but let us take it as given that this is a simplification of a pattern that can be used for more complex code that would not be clear in a list comprehension.)
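
To make the "more complex code" case concrete, here is a minimal sketch of logic that carries state between iterations and so does not fit comfortably in a comprehension. The functions running_totals and running_totals_list and the sample data are hypothetical, invented purely for illustration; they are not taken from the book.

```python
def running_totals(nums):
    """Yield a running total that resets whenever a negative number appears."""
    total = 0
    for n in nums:
        if n < 0:
            total = 0  # a negative value resets the total and is not yielded
            continue
        total += n
        yield total


def running_totals_list(nums):
    """Same logic, but building and returning the list by hand."""
    result = []
    total = 0
    for n in nums:
        if n < 0:
            total = 0
            continue
        total += n
        result.append(total)
    return result


sample = [1, 2, -1, 3, 4]  # hypothetical sample data
# Both forms produce the same list; only the bookkeeping differs.
assert list(running_totals(sample)) == running_totals_list(sample)
```

The generator version drops the result list and the append call, which is the "less noise" argument in a slightly more realistic setting.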

My question is this: is there a cost to wrapping the generator in a list? Would it be equivalent in performance to the list-building function?

Sam Redway asked Mar 02 '18 22:03


1 Answer

To find out, I wrote and ran the following quick test:

from functools import wraps
from time import time

TEST_DATA = range(100)


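def timeit(func):
    """Print the wall-clock time of a single call to func."""
    @wraps(func)
    def wrapped(*args, **kwargs):
        start = time()
        result = func(*args, **kwargs)
        end = time()
        print(f'running time for {func.__name__} = {end-start}')
        # Pass the wrapped function's return value through to the caller.
        return result
    return wrapped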

def num_squared_iterator(nums):
    for i in nums:
        yield i**2

@timeit
def get_num_squared_list(nums):
    l = []
    for i in nums:
        l.append(i**2)
    return l

@timeit
def get_num_squared_list_from_iterator(nums):
    return list(num_squared_iterator(nums))


if __name__ == '__main__':
    get_num_squared_list(TEST_DATA)
    get_num_squared_list_from_iterator(TEST_DATA)

I ran the test code many times, and each time (much to my surprise) the get_num_squared_list_from_iterator function ran fractionally faster than the get_num_squared_list function.

Here are the results of my first few runs (times in seconds):

1. running time for get_num_squared_list = 5.2928924560546875e-05
   running time for get_num_squared_list_from_iterator = 5.0067901611328125e-05

2. running time for get_num_squared_list = 5.3882598876953125e-05
   running time for get_num_squared_list_from_iterator = 4.982948303222656e-05

3. running time for get_num_squared_list = 5.1975250244140625e-05
   running time for get_num_squared_list_from_iterator = 4.76837158203125e-05

My guess is that this is due to the cost of calling l.append on every iteration of the loop in get_num_squared_list.

I find this interesting because not only is the generator version clear and elegant, it also seems to perform slightly better in this test.
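
The differences above are a few microseconds on a single call, which is close to the resolution of time(), so a repeated measurement would be a more reliable check. The following is a minimal sketch using the standard library's timeit module. The helper best_time and the choices number=1000 and repeat=5 are my own assumptions for illustration, not part of the original test, and the outcome may vary by machine and Python version.

```python
import timeit


def num_squared_iterator(nums):
    for i in nums:
        yield i**2


def get_num_squared_list(nums):
    l = []
    for i in nums:
        l.append(i**2)
    return l


def get_num_squared_list_from_iterator(nums):
    return list(num_squared_iterator(nums))


def best_time(func, data, number=1000, repeat=5):
    """Hypothetical helper: fastest of `repeat` timings of `number` calls each."""
    # Taking the minimum reduces the influence of other activity on the machine.
    return min(timeit.repeat(lambda: func(data), number=number, repeat=repeat))


if __name__ == '__main__':
    data = range(100)
    for func in (get_num_squared_list, get_num_squared_list_from_iterator):
        print(f'{func.__name__}: {best_time(func, data):.6f} s per 1000 calls')
```

Each function is called many times per measurement here, so the reported totals are far above the timer's resolution and easier to compare than a single call.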

Sam Redway answered Oct 13 '22 01:10