Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are Lists faster than character arrays for string concatenation

Tags:

python

string

In the article linked below, the author compares the efficiency of different string concatenation methodologies in python: http://www.skymind.com/~ocrow/python_string/

One thing that I did not understand is, why does method 3 (Mutable Character Arrays) result in a significantly slower performance than method 4 (joining a list of strings)

both of them are mutable and I would think that they should have comparable performance.

like image 304
ThinkBonobo Avatar asked Oct 03 '22 02:10

ThinkBonobo


1 Answers

"Both of them are mutable" is misleading you a bit.

It's true that in the list-append method, the list is mutable. But building up the list isn't the slow part. If you have 1000 strings of average length 1000, you're doing 1000000 mutations to the array, but only 1000 mutations to the list (plus 1000 increfs to string objects).

In particular, that means the array will have to spend 1000x as much time expanding (allocating new storage and copying the whole thing so far).

The slow part for the list method is the str.join call at the end. But that isn't mutable, and doesn't require any expanding. It uses two passes, to first calculate the size needed, then copy everything into it.

Also, that code inside str.join has had (and has continued to have since that article was written 9 years ago) a lot of work to optimize it, because it's a very common, and recommended, idiom that many real programs depend on every day; array has barely been touched since it was first added to the language.

But if you really want to understand the differences, you have to look at the source. In 2.7, the main work for the array method is in array_fromstring, while the main work for the list method is in string_join. You can see how the latter takes advantage of the fact that we already know all of the strings we're going to be joining up at the start, while the former can't.

like image 173
abarnert Avatar answered Oct 08 '22 00:10

abarnert