
Most Efficient Method to Concatenate Strings in Python

At the time of asking this question, I'm using Python 3.8.

When I say efficient, I'm referring only to the speed at which the strings are concatenated. In more technical terms, I'm asking about time complexity and not taking space complexity into account.

The only methods I can think of at the moment are the following three, given that:

a = 'start'
b = ' end'

Method 1

result = a + b

Method 2

result = ''.join((a, b))

Method 3

result = '{0}{1}'.format(a, b)

I want to know which of these methods is fastest, or whether there are other methods that are more efficient. Also, if you know whether any of these methods performs differently with more strings or longer strings, please include that in your answer.

Edit

After seeing all the comments and answers, I have learned a couple of new ways to concatenate strings, and I have also learned about the timeit library. I will report my personal findings below:

>>> import timeit

>>> print(timeit.Timer('result = a + b', setup='a = "start"; b = " end"').timeit(number=10000))
0.0005306000000473432

>>> print(timeit.Timer('result = "".join((a, b))', setup='a = "start"; b = " end"').timeit(number=10000))
0.0011297000000354274

>>> print(timeit.Timer('result = "{0}{1}".format(a, b)', setup='a = "start"; b = " end"').timeit(number=10000))
0.002327799999989111

>>> print(timeit.Timer('result = f"{a}{b}"', setup='a = "start"; b = " end"').timeit(number=10000))
0.0005772000000092703

>>> print(timeit.Timer('result = "%s%s" % (a, b)', setup='a = "start"; b = " end"').timeit(number=10000))
0.0017815999999584164

It seems that for these small strings, the traditional a + b method is the fastest for string concatenation. Thanks for all of the answers!

VoidTwo asked Apr 23 '20 16:04


2 Answers

Why don't you try it out? You can use timeit.timeit() to run a statement many times and return the overall duration.

Here, we use s to set up the variables a and b (not included in the overall time), and then run the various options 10 million times.

>>> from timeit import timeit
>>>
>>> n = 10 * 1000 * 1000
>>> s = "a = 'start'; b = ' end'"
>>>
>>> timeit("c = a + b",                 setup=s, number=n)
0.4452877212315798
>>>
>>> timeit("c = f'{a}{b}'",             setup=s, number=n)
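0.5252049304544926
>>>
>>> # Note: '%s%s' has no {} replacement fields, so .format() returns it unchanged; this statement does not concatenate a and b.
>>> timeit("c = '%s%s'.format(a, b)",   setup=s, number=n)
0.6849184390157461
>>>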
>>> timeit("c = ''.join((a, b))",       setup=s, number=n)
0.8546998891979456
>>>
>>> timeit("c = '%s%s' % (a, b)",       setup=s, number=n)
1.1699129864573479
>>>
>>> timeit("c = '{0}{1}'.format(a, b)", setup=s, number=n)
1.5954962372779846

This shows that unless your application's bottleneck is string concatenation, it's probably not worth being too concerned about...

  • The best case is ~0.45 seconds for 10 million iterations, or about 45ns per operation.
  • The worst case is ~1.59 seconds for 10 million iterations, or about 159ns per operation.

Even if you're performing 10 million operations, switching from the slowest method to the fastest saves only about 1.15 seconds in total.
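
The per-operation figures above are just each total divided by the iteration count. Here is a minimal sketch of that arithmetic using the two extreme timings from this session (the names totals and ns_per_op are only illustrative, not part of timeit):

```python
# Totals (seconds for 10 million iterations) copied from the session above.
N = 10 * 1000 * 1000
totals = {
    "a + b": 0.4452877212315798,
    "'{0}{1}'.format(a, b)": 1.5954962372779846,
}


def ns_per_op(total_seconds, iterations=N):
    """Convert a total timeit duration into nanoseconds per statement."""
    return total_seconds / iterations * 1e9


for stmt, total in totals.items():
    print(f"{stmt:<24} {ns_per_op(total):6.1f} ns per operation")

# Gap between the slowest and fastest method over the whole run, in seconds.
gap = max(totals.values()) - min(totals.values())
print(f"difference over {N} iterations: {gap:.2f} s")
```

These numbers come from one machine and one run, so treat them as a rough scale rather than a precise cost.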

Note that your results may vary quite drastically depending on the lengths (and number) of the strings you're concatenating, and the hardware you're running on.
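
To see how string length affects the picture on your own machine, something like the following could be used. It is only a sketch: the helper time_for_length, the chosen lengths, and the iteration count are my own assumptions, and it covers just three of the methods.

```python
from timeit import timeit

# Statements to compare; a and b are supplied through the globals dict below.
STATEMENTS = [
    "c = a + b",
    "c = f'{a}{b}'",
    "c = ''.join((a, b))",
]


def time_for_length(length, number=20_000):
    """Return {statement: seconds} for two strings of `length` characters each."""
    env = {"a": "s" * length, "b": "e" * length}
    return {stmt: timeit(stmt, globals=env, number=number) for stmt in STATEMENTS}


for length in (5, 500, 5000):
    print(f"length {length}:")
    for stmt, seconds in time_for_length(length).items():
        print(f"  {stmt:<22} {seconds:.4f}s")
```

As the strings get longer, copying the characters tends to dominate, so the gap between methods may shrink; check the printed numbers rather than assuming it.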

Attie answered Oct 02 '22 17:10


For exactly two strings a and b, just use a + b. The alternatives are for joining more than two strings: they avoid the temporary str object created by each use of +, as well as the quadratic behavior that comes from repeatedly copying the contents of earlier results into the next one.

(There's also f'{a}{b}', but it's syntactically heavier and no faster than a + b.)
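
As a rough illustration of the many-strings case, here is a sketch comparing chained + with a single join. The function names and the sample data are made up for the example. Note that CPython can sometimes extend the left operand in place, so the measured gap may be much smaller than the quadratic worst case; that is an implementation detail and not something to rely on.

```python
from timeit import timeit


def concat_plus(parts):
    """Chain + left to right; each step may build a new intermediate str."""
    result = ""
    for part in parts:
        result = result + part
    return result


def concat_join(parts):
    """Hand all parts to str.join in one call, with no intermediate results."""
    return "".join(parts)


# Hypothetical sample data: 10,000 short pieces.
parts = [f"item{i}," for i in range(10_000)]

# Both approaches must produce the same string.
assert concat_plus(parts) == concat_join(parts)

for func in (concat_plus, concat_join):
    seconds = timeit(lambda: func(parts), number=100)
    print(f"{func.__name__:<12} {seconds:.4f}s")
```

If the pieces are produced one at a time, appending them to a list and calling join once at the end follows the same idea.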

chepner answered Oct 02 '22 15:10