Concatenating strings faster than appending to lists

Question

I am trying to isolate specific items in a list (e.g. [0, 1, 1] will return [0, 1]). I managed to get through this, but I noticed something strange.

When I tried to append to list it ran about 7 times slower then when I was concatenating strings and then splitting it.

This is my code:

import time
start = time.time()

first = [x for x in range(99999) if x % 2 == 0]
second = [x for x in range(99999) if x % 4 == 0]

values = first + second

distinct_string = ""

for i in values:
    if not str(i) in distinct_string:
        distinct_string += str(i) + " "

print(distinct_string.split())

print(" --- %s sec --- " % (start - time.time()))

This result end in about 5 seconds... Now for the lists:

import time
start = time.time()

first = [x for x in range(99999) if x % 2 == 0]
second = [x for x in range(99999) if x % 4 == 0]

values = first + second

distinct_list = []

for i in values:
    if not i in distinct_list:
        distinct_list.append(i)

print(distinct_list)

print(" --- %s sec --- " % (start - time.time()))

Runs at around 40 seconds.

What makes string faster even though I am converting a lot of values to strings?

jonrsharpe · Accepted Answer

Note that it's generally better to use timeit to compare functions, which runs the same thing multiple times to get average performance, and to factor out repeated code to focus on the performance that matters. Here's my test script:

first = [x for x in range(999) if x % 2 == 0]
second = [x for x in range(999) if x % 4 == 0]

values = first + second

def str_method(values):
    distinct_string = ""
    for i in values:
        if not str(i) in distinct_string:
            distinct_string += str(i) + " "
    return [int(s) for s in distinct_string.split()]

def list_method(values):
    distinct_list = []
    for i in values:
        if not i in distinct_list:
            distinct_list.append(i)
    return distinct_list

def set_method(values):
    seen = set()
    return [val for val in values if val not in seen and seen.add(val) is None]

if __name__ == '__main__':
    assert str_method(values) == list_method(values) == set_method(values)
    import timeit
    funcs = [func.__name__ for func in (str_method, list_method, set_method)]
    setup = 'from __main__ import {}, values'.format(', '.join(funcs))
    for func in funcs:
        print(func)
        print(timeit.timeit(
            '{}(values)'.format(func),
            setup=setup,
            number=1000
        ))

I've added int conversion to make sure that the functions return the same thing, and get the following results:

str_method
1.1685157899992191
list_method
2.6124089090008056
set_method
0.09523714500392089

Note that it is not true that searching in a list is faster than searching in a string if you have to convert the input:

>>> timeit.timeit('1 in l', setup='l = [9, 8, 7, 6, 5, 4, 3, 2, 1]')
0.15300405000016326
>>> timeit.timeit('str(1) in s', setup='s = "9 8 7 6 5 4 3 2 1"')
0.23205067300295923

Repeated appending to a list is not very efficient, as it means frequent resizing of the underlying object - the list comprehension, as shown in the set version, is more efficient.

LetzerWille · Answer

searching in strings:

if not str(i) in distinct_string:

is much faster

then searching in lists

if not i in distinct_list:

here are lprofile lines for string search in OP code

Line #      Hits         Time  Per Hit   % Time      Line Contents 


    17     75000     80366013   1071.5     92.7       if not str(i) in distinct_string:
    18     50000      2473212     49.5      2.9                  distinct_string += str(i) + " "

and for list search in OP code

   39     75000    769795432  10263.9     99.1          if not i in distinct_list:
   40     50000      2813804     56.3      0.4              distinct_list.append(i)

Concatenating strings faster than appending to lists

Tags:

performance

python

string

list

python-3.x

Alekskill124

2 Answers

jonrsharpe

LetzerWille

Recent Activity

Donate For Us

Concatenating strings faster than appending to lists

Tags:

performance

python

string

list

python-3.x

Alekskill124

2 Answers

jonrsharpe

LetzerWille

Related questions

Recent Activity

Donate For Us