Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

str = str + "abc" slower than str = "abc" + str?

Can you believe it? I have a loop like this (forgive any errors, I had to heavily redact a lot of info and variable names, trust me it works).

...Old example edited out, see below code...

And if I change those middle str = "Blah \(odat.count)" + str type lines to str = str + "Blah \(odat.count)" the UI grinds to a halt and I get colour wheel. The NSTextField does get to the first self.display.string... but then freezes.

I'm a multithreading novice so please feel free to correct my method. Hopefully it's clear what I want.

I have to admit the working version is also a little stuttery but never actually freezes. Typical values are n = 70, var3 = 7.

EDIT:

Here is a fully working example. Just link up the textview, progress bar, and button(s). Try changing between the main functions.

//
//  Controllers.swift
//
//

import Cocoa

class MainController: NSObject {

    @IBOutlet var display: NSTextView!
    @IBOutlet weak var prog: NSProgressIndicator!

    @IBAction func go1(sender: AnyObject) {
        theRoutine(70)
    }

    @IBAction func go2(sender: AnyObject) {
        theRoutine(50)
    }

    class SomeClass {
        var x: Int
        var y: Int
        var p: Double

        init?(size: Int, pro: Double) {
            x = size
            y = size
            p = pro
        }
    }

    func theRoutine(n: Int) {
        prog.hidden = false
        prog.doubleValue = 0
        prog.maxValue = 7 * 40
        let priority = DISPATCH_QUEUE_PRIORITY_HIGH
        dispatch_async(dispatch_get_global_queue(priority, 0)) {
            self.theFunc(n, var1: 0.06, var2: 0.06, var3: 7)
            self.theFunc(n, var1: 0.1*log(Double(n))/Double(n), var2: 0.3*log(Double(n))/Double(n), var3: 7)
            dispatch_async(dispatch_get_main_queue()) {
                self.prog.hidden = true
                self.appOut("done!")
            }
        }
    }

    //This doesn't
//  func theFunc(n: Int, var1: Double, var2: Double, var3: Int) {
//      var m: AnEnum
//      var gra: SomeClass
//      var p = var1
//      for _ in 0...(var3 - 1) {
//          var str  = "blah \(p)\n"
//          for _ in 1...20 {
//              gra = SomeClass(size: n, pro: p)!
//              m = self.doSomething(gra)
//              switch m {
//              case .First(let dat):
//                  str = str + "Blah:\n\(self.arrayF(dat, transform: {"blah\($0)blah\($1)=blah"}))" + "\n\n" + str
//              case .Second(let odat):
//                  str = str + "Blah\(odat.count) blah\(self.arrayF(odat, transform: {"bl\($1)"}))" + "\n\n" + str
//              }
//              dispatch_async(dispatch_get_main_queue()) {
//                  self.prog.incrementBy(1)
//              }
//          }
//          dispatch_async(dispatch_get_main_queue()) {
//              // update some UI
//              self.display.string = str + "\n" + (self.display.string ?? "")
//          }
//          p += var2
//      }
//  }

    //This works
    func theFunc(n: Int, var1: Double, var2: Double, var3: Int) {
        var m: AnEnum
        var gra: SomeClass
        var p = var1
        for _ in 0...(var3 - 1) {
            var str  = "blah \(p)\n"
            for _ in 1...20 {
                gra = SomeClass(size: n, pro: p)!
                m = self.doSomething(gra)
                switch m {
                case .First(let dat):
                    str = "Blah:\n\(self.arrayF(dat, transform: {"blah\($0)blah\($1)=blah"}))" + "\n\n" + str
                case .Second(let odat):
                    str = "Blah\(odat.count) blah\(self.arrayF(odat, transform: {"bl\($1)"}))" + "\n\n" + str
                }
                dispatch_async(dispatch_get_main_queue()) {
                    self.prog.incrementBy(1)
                }
            }
            dispatch_async(dispatch_get_main_queue()) {
                // update some UI
                self.display.string = str + "\n" + (self.display.string ?? "")
            }
            p += var2
        }
    }

    func doSomething(G: SomeClass) -> AnEnum {
        usleep(30000)
        if drand48() <= G.p {
            return AnEnum.First([0, 0])
        } else {
            return AnEnum.Second([1, 1, 1])
        }
    }

    enum AnEnum {
        case First([Int])
        case Second([Int])
    }

    func appOut(out: String?) {
        if out != nil {
            display.string = out! + "\n\n" + (display.string ?? "")
        }
    }

    func arrayF(array: [Int], transform: (index: Int, value: Int) -> String) -> String {
        let arr = Array(0...(array.count - 1))
        return "[\(arr.map{transform(index: $0, value: array[$0])}.joinWithSeparator(", "))]"
    }
}
like image 309
Richard Birkett Avatar asked Jan 23 '26 20:01

Richard Birkett


2 Answers

Since you are not really asking a question other than

Can you believe it?

I will tell you that I surely can, but seriously there are a lot of cases when prepending something may be slower/faster than appending. Take for instance linked list. Prepend is O(1) and Append is O(N) if you are not holding reference to the last element of the list.

I put together gist that just times that particular issue , and on 5-6 runs it does not seem to be significant difference , however prepend is still 10% slower on my machine.

What is interesting you basically have 4 ways of concatenating strings in Swift regarding your case:

  • prepend to accumulator str = newstr + str
  • append to accumulator str = str + newstr
  • append-mutate str.append(newstr)
  • use good-old array as a string buffer and join it all at once a = []; a.append(x); str = a.joined(separator: " ")

On my machine they seems to all deviate around pretty much the same time with typical timing like this:

prepend
real    0m0.082s
user    0m0.060s
sys 0m0.018s

append
real    0m0.070s
user    0m0.049s
sys 0m0.018s

append mutate
real    0m0.075s
user    0m0.054s
sys 0m0.019s

join
real    0m0.086s
user    0m0.064s
sys 0m0.020s

Append being fastest.

You can see code for all four cases in my gist https://gist.github.com/ojosdegris/df72a94327d12a67fe65e5989f9dcc53

If you check out Swift sources on Github, you'll see this:

@effects(readonly)
@_semantics("string.concat")
public static func + (lhs: String, rhs: String) -> String {
 if lhs.isEmpty {
  return rhs
 }
 var lhs = lhs
 lhs._core.append(rhs._core)
 return lhs
}

So what is probably happening as accumulator string grows it is more expensive to copy it.

like image 138
vittore Avatar answered Jan 25 '26 21:01

vittore


Vittore's answer is correct. Looking at Swift's source code for String (stdlib/public/core/String.swift), we can see:

Although strings in Swift have value semantics, strings use a copy-on-write strategy to store their data in a buffer. This buffer can then be shared by different copies of a string. A string's data is only copied lazily, upon mutation, when more than one string instance is using the same buffer. Therefore, the first in any sequence of mutating operations may cost O(n) time and space.

When a string's contiguous storage fills up, a new buffer must be allocated and data must be moved to the new storage. String buffers use an exponential growth strategy that makes appending to a string a constant time operation when averaged over many append operations.

Per Wikipedia's Copy-on-write:

If a resource is duplicated but not modified, it is not necessary to create a new resource; the resource can be shared between the copy and the original.

With this in mind, when doing str = str + "abc", the compiler is performing a shallow copy of str and appending "abc" to its contiguous memory. On the other hand, str = "abc" + str creates an independent instance with its own unique copy of data since it isn't using contiguous memory anymore.

like image 41
Nick G Avatar answered Jan 25 '26 20:01

Nick G



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!