
How to prove "copy-on-write" on String type in Swift

As the title says, I tried to prove to myself that COW (copy-on-write) is supported for String in Swift, but I could not find a proof. I verified COW for Array and Dictionary with the following code:

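import Foundation // needed for String(format:)

// Formats the pointer that Swift passes for the argument as a hex address.
func address(of object: UnsafeRawPointer) -> String {
    let addr = Int(bitPattern: object)
    return String(format: "%p", addr)
}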

var xArray = [20, 30, 40, 50, 60]
var yArray = xArray

// These two addresses were the same
address(of: xArray) 
address(of: yArray)

yArray[0] = 200
// The address of yArray got changed
address(of: yArray)

But for the String type, this did not work.

var xString = "Hello World"
var yString = xString

// These two addresses were different
address(of: xString)
address(of: yString)

I also copied the test function from the official Swift source repository.

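// Reinterprets the String as three words and drops the middle one (the count).
// This relies on the three-word String layout of the Swift version used here.
func _rawIdentifier(s: String) -> (UInt, UInt) {
    let triple = unsafeBitCast(s, to: (UInt, UInt, UInt).self)
    let minusCount = (triple.0, triple.2)
    return minusCount
}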

But this function seems to reflect only the value the string holds, not the address of the variable, so two different String variables with the same value have the same rawIdentifier. This still does not prove COW to me.

var xString = "Hello World"
var yString = "Hello" + " World" 

// These two rawIdentifiers were the same
_rawIdentifier(s: xString)
_rawIdentifier(s: yString)

So how does COW work for the String type in Swift?

Wu_ asked Oct 14 '17 17:10



1 Answer

The compiler creates only a single storage for both "Hello World" and "Hello" + " World": the concatenation of the two literals ends up as one constant at compile time.

You can verify that for example by examining the assembly code obtained from

swiftc -emit-assembly cow.swift

which defines only a single string literal

    .section    __TEXT,__cstring,cstring_literals
L___unnamed_1:
    .asciz  "Hello World"

As soon as the string is mutated, the address of the string storage buffer (the first member of that "magic" tuple, actually _baseAddress of struct _StringCore, defined in StringCore.swift) changes:

var xString = "Hello World"
var yString = "Hello" + " World"

print(_rawIdentifier(s: xString)) // (4300325536, 0)
print(_rawIdentifier(s: yString)) // (4300325536, 0)

yString.append("!")
print(_rawIdentifier(s: yString)) // (4322384560, 4322384528)
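The tuple trick above depends on the internal String layout of the Swift version used here. As a rough sketch for newer toolchains (assuming Swift 5.1 or later, where String.UTF8View supports withContiguousStorageIfAvailable), a similar check can compare the address of the UTF-8 storage. The helper name utf8StorageAddress and the variable names are made up for this illustration. The string is built at run time and kept well above 15 bytes, on the assumption that shorter strings are stored inline and have no shared heap buffer to observe:

```swift
// Hypothetical helper: numeric address of the string's contiguous UTF-8 storage.
// The pointer is only valid inside the closure, so only its integer value escapes.
func utf8StorageAddress(of s: String) -> Int? {
    return s.utf8.withContiguousStorageIfAvailable { Int(bitPattern: $0.baseAddress) }
}

// Built at run time so that the storage is a heap buffer, not a literal or inline string.
let xLong = String(repeating: "a", count: 100)
var yLong = xLong

// Both values still refer to the same buffer.
let sharedBeforeMutation = utf8StorageAddress(of: xLong) == utf8StorageAddress(of: yLong)

// The first mutation of the shared buffer forces yLong to take its own copy.
yLong.append("!")

let sharedAfterMutation = utf8StorageAddress(of: xLong) == utf8StorageAddress(of: yLong)

print(sharedBeforeMutation) // true
print(sharedAfterMutation)  // false
```

Only the comparison matters here; the concrete addresses differ from run to run.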

And why does your

func address(of object: UnsafeRawPointer) -> String

function show the same values for xArray and yArray, but not for xString and yString?

Passing an array to a function taking an unsafe pointer passes the address of the first array element, which is the same for both arrays if they share the storage.

Passing a string to a function taking an unsafe pointer passes a pointer to a temporary UTF-8 representation of the string. That address can be different in each call, even for the same string.

This behavior is documented in the "Using Swift with Cocoa and Objective-C" reference for UnsafePointer<T> arguments, but apparently works the same for UnsafeRawPointer arguments.
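For the array case, the same observation can be made without relying on the implicit pointer conversion, by asking for the element storage explicitly. This is a minimal sketch; the helper name elementStorageAddress and the variable names are assumptions for illustration, not part of the question's code:

```swift
// Hypothetical helper: numeric address of the array's element storage.
// withUnsafeBufferPointer exposes the buffer only for the duration of the closure,
// so only the integer value of the base address is returned.
func elementStorageAddress(of array: [Int]) -> Int {
    return array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

let firstArray = [20, 30, 40, 50, 60]
var secondArray = firstArray

// Both arrays refer to the same element storage after the assignment.
let arraysShareBefore = elementStorageAddress(of: firstArray) == elementStorageAddress(of: secondArray)

// Writing through secondArray triggers the copy; firstArray keeps the old buffer.
secondArray[0] = 200

let arraysShareAfter = elementStorageAddress(of: firstArray) == elementStorageAddress(of: secondArray)

print(arraysShareBefore) // true
print(arraysShareAfter)  // false
```

Unlike the implicit string-to-pointer conversion, this always refers to the array's own buffer, which is why the comparison is meaningful for Array.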

Martin R answered Oct 09 '22 13:10
