
Is list join really faster than string concatenation in python?

I find that string concatenation seems to compile to fewer Python bytecode instructions than list join.

This is an example.

test.py:

a = ''.join(['a', 'b', 'c'])
b = 'a' + 'b' + 'c'

Then I ran python -m dis test.py and got the following bytecode (Python 2.7):

  1           0 LOAD_CONST               0 ('')
              3 LOAD_ATTR                0 (join)
              6 LOAD_CONST               1 ('a')
              9 LOAD_CONST               2 ('b')
             12 LOAD_CONST               3 ('c')
             15 BUILD_LIST               3
             18 CALL_FUNCTION            1
             21 STORE_NAME               1 (a)

  3          24 LOAD_CONST               6 ('abc')
             27 STORE_NAME               2 (b)
             30 LOAD_CONST               4 (None)
             33 RETURN_VALUE  

Obviously, string concatenation produces fewer bytecode instructions. It just loads the string 'abc' directly.

Can anyone explain why we always say that list join is much better?

zhoutall asked Apr 22 '13 12:04


2 Answers

From "Efficient String Concatenation in Python":

Method 1: 'a' + 'b' + 'c'

Method 6: a = ''.join(['a', 'b', 'c'])

20,000 integers were concatenated into a string 86 kB long:

              Concatenations per second    Process size (kB)
  Method 1                        3,770                2,424
  Method 6                      119,800                3,000

Conclusion: YES, str.join() is significantly faster than typical concatenation (str1 + str2).
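A rough way to check this kind of claim yourself is a small timeit script. The sketch below is only an illustration, not the benchmark from the cited article: the function names concat_with_plus and concat_with_join are made up here, and the timings will vary with the Python version and the machine, so no expected numbers are shown.

```python
import timeit


def concat_with_plus(numbers):
    # Build the string one piece at a time with +=.
    result = ''
    for n in numbers:
        result += str(n)
    return result


def concat_with_join(numbers):
    # Collect the pieces in a list first, then join them in one call.
    return ''.join([str(n) for n in numbers])


numbers = range(20000)

# Both approaches must produce the same string before timing means anything.
assert concat_with_plus(numbers) == concat_with_join(numbers)

plus_seconds = timeit.timeit(lambda: concat_with_plus(numbers), number=10)
join_seconds = timeit.timeit(lambda: concat_with_join(numbers), number=10)

print('+=   : %.4f s for 10 runs' % plus_seconds)
print('join : %.4f s for 10 runs' % join_seconds)
```

Run it a few times and on the interpreter you actually deploy on before drawing conclusions from the numbers.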

Thanakron Tandavas answered Oct 21 '22 03:10


Don't believe it! Always get proof!

Source: I stared at the Python source code for an hour and calculated complexities!

My findings.

For 2 strings (assume n is the length of both strings):

Concat (+) - O(n)
Join - O(n+k) effectively O(n)
Format - O(2n+k) effectively O(n)

For more than 2 strings (assume n is the length of all strings):

Concat (+) - O(n^2)
Join - O(n+k) effectively O(n)
Format - O(2n+k) effectively O(n)
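To see where the quadratic term for repeated concatenation comes from, here is a simplified cost model. It does not measure CPython. It only counts characters copied under the assumption that every + builds a brand new string, while join copies each piece once. The helper names are hypothetical. Real CPython can sometimes grow a string in place for s += t, so measured times may look better than this model suggests.

```python
def chars_copied_by_concat(lengths):
    # Model: each concatenation allocates a new string and copies
    # everything accumulated so far plus the new piece.
    copied = 0
    current = 0
    for length in lengths:
        current += length
        copied += current
    return copied


def chars_copied_by_join(lengths):
    # Model: join sizes the result once and copies each piece once.
    return sum(lengths)


pieces = [1] * 1000  # 1000 one-character strings
print(chars_copied_by_concat(pieces))  # 500500
print(chars_copied_by_join(pieces))    # 1000
```

With 1000 one-character pieces the model copies 1 + 2 + ... + 1000 characters for concatenation and only 1000 for join, which is the difference between quadratic and linear growth.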

RESULT:

If you have two strings, concatenation (+) is technically better, though in practice it is effectively the same as join and format.

If you have more than two strings, concat becomes awful, and join and format are effectively the same, though technically join is a bit better.

SUMMARY:

If you don't care about efficiency, use any of the above. (Though since you asked the question, I would assume you care.)

Therefore -

If you have 2 strings, use concat (when not in a loop!).
If you have more than two strings (all strings), or you are in a loop, use join.
If you have anything that is not a string, use format, because duh.
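The three cases above might look like this in practice. This is a minimal sketch with made-up variable names and values:

```python
# Two strings, not in a loop: plain concatenation is fine.
first, second = 'spam', 'eggs'
pair = first + second

# Many strings, or building in a loop: collect the pieces, then join once.
parts = []
for word in ['a', 'b', 'c']:
    parts.append(word)
joined = ''.join(parts)

# Values that are not strings: let format do the conversion.
count, ratio = 3, 0.5
mixed = '{0} items, ratio {1}'.format(count, ratio)

print(pair)    # spameggs
print(joined)  # abc
print(mixed)   # 3 items, ratio 0.5
```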

Hope this helps!

Tsukumo answered Oct 21 '22 04:10