Fastest way to convert an iterator to a list

People also ask

Are lists iterators Python?

A list is an iterable. But it is not an iterator. If we run the __iter__() method on our list, it will return an iterator.

What does ITER function do in Python?

The iter() function returns an iterator object.

Can we create an iterator object from a list?

Iterable is an object, which one can iterate over. It generates an Iterator when passed to the iter() method. Lists, tuples, dictionaries, strings and sets are all iterable objects. They are iterable containers that you can convert into an iterator.

list(your_iterator)

since python 3.5 you can use * iterable unpacking operator:

user_list = [*your_iterator]

but the pythonic way to do it is:

user_list  = list(your_iterator)

@Robino was suggesting to add some tests which make sense, so here is a simple benchmark between 3 possible ways (maybe the most used ones) to convert an iterator to a list:

by type constructor

list(my_iterator)

by unpacking

[*my_iterator]

using list comprehension

[e for e in my_iterator]

I have been using simple_bechmark library

from simple_benchmark import BenchmarkBuilder
from heapq import nsmallest

b = BenchmarkBuilder()

@b.add_function()
def convert_by_type_constructor(size):
    list(iter(range(size)))

@b.add_function()
def convert_by_list_comprehension(size):
    [e for e in iter(range(size))]

@b.add_function()
def convert_by_unpacking(size):
    [*iter(range(size))]


@b.add_arguments('Convert an iterator to a list')
def argument_provider():
    for exp in range(2, 22):
        size = 2**exp
        yield size, size

r = b.run()
r.plot()

enter image description here

As you can see there is very hard to make a difference between conversion by the constructor and conversion by unpacking, conversion by list comprehension is the “slowest” approach.

I have been testing also across different Python versions (3.6, 3.7, 3.8, 3.9) by using the following simple script:

import argparse
import timeit

parser = argparse.ArgumentParser(
    description='Test convert iterator to list')
parser.add_argument(
    '--size', help='The number of elements from iterator')

args = parser.parse_args()

size = int(args.size)
repeat_number = 10000

# do not wait too much if the size is too big
if size > 10000:
    repeat_number = 100


def test_convert_by_type_constructor():
    list(iter(range(size)))


def test_convert_by_list_comprehension():
    [e for e in iter(range(size))]


def test_convert_by_unpacking():
    [*iter(range(size))]


def get_avg_time_in_ms(func):
    avg_time = timeit.timeit(func, number=repeat_number) * 1000 / repeat_number
    return round(avg_time, 6)


funcs = [test_convert_by_type_constructor,
         test_convert_by_unpacking, test_convert_by_list_comprehension]

print(*map(get_avg_time_in_ms, funcs))

The script will be executed via a subprocess from a Jupyter Notebook (or a script), the size parameter will be passed through command-line arguments and the script results will be taken from standard output.

from subprocess import PIPE, run

import pandas

simple_data = {'constructor': [], 'unpacking': [], 'comprehension': [],
        'size': [], 'python version': []}


size_test = 100, 1000, 10_000, 100_000, 1_000_000
for version in ['3.6', '3.7', '3.8', '3.9']:
    print('test for python', version)
    for size in size_test:
        command = [f'python{version}', 'perf_test_convert_iterator.py', f'--size={size}']
        result = run(command, stdout=PIPE, stderr=PIPE, universal_newlines=True)
        constructor, unpacking,  comprehension = result.stdout.split()
        
        simple_data['constructor'].append(float(constructor))
        simple_data['unpacking'].append(float(unpacking))
        simple_data['comprehension'].append(float(comprehension))
        simple_data['python version'].append(version)
        simple_data['size'].append(size)

df_ = pandas.DataFrame(simple_data)
df_

enter image description here

You can get my full notebook from here.

In most of the cases, in my tests, unpacking shows to be faster, but the difference is so small that the results may change from a run to the other. Again, the comprehension approach is the slowest, in fact, the other 2 methods are up to ~ 60% faster.

Related questions
                            
                                What is the source code of the "this" module doing?
                            
                                Object of custom type as dictionary key
                            
                                How to get exit code when using Python subprocess communicate method?
                            
                                Understanding __getitem__ method
                            
                                Bulk package updates using Conda
                            
                                How to find which columns contain any NaN value in Pandas dataframe
                            
                                Replacing blank values (white space) with NaN in pandas
                            
                                How to create a temporary directory and get its path/ file name?
                            
                                What exactly does the .join() method do?
                            
                                What is the best way to implement nested dictionaries?
                            
                                Convert string to Python class object?
                            
                                Python logging: use milliseconds in time format
                            
                                Python3: ImportError: No module named '_ctypes' when using Value from module multiprocessing
                            
                                How do I print bold text in Python?
                            
                                Why can't Python's raw string literals end with a single backslash?
                            
                                warning about too many open figures
                            
                                How to put individual tags for a matplotlib scatter plot?
                            
                                Parsing HTML using Python
                            
                                Creating Threads in python
                            
                                Does uninstalling a package with "pip" also remove the dependent packages?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Fastest way to convert an iterator to a list

Tags:

python

iterator

list-comprehension

People also ask

Recent Activity

Donate For Us