
Swift semantics regarding dictionary access

Tags:

swift

I'm currently reading the excellent Advanced Swift book from objc.io, and I'm running into something that I don't understand.

If you run the following code in a playground, you will notice that when a struct contained in a dictionary is modified, the subscript access makes a copy, but the original value in the dictionary then appears to be replaced by that copy. I don't understand why. What exactly is happening?

Also, is there a way to avoid the copy? According to the author of the book, there isn't, but I just want to be sure.

import Foundation

class Buffer {
    let id = UUID()
    var value = 0

    func copy() -> Buffer {
        let new = Buffer()
        new.value = self.value
        return new
    }
}

struct COWStruct {
    var buffer = Buffer()

    init() { print("Creating \(buffer.id)") }

    mutating func change() -> String {
        if isKnownUniquelyReferenced(&buffer) {
            buffer.value += 1
            return "No copy \(buffer.id)"
        } else {
            let newBuffer = buffer.copy()
            newBuffer.value += 1
            buffer = newBuffer
            return "Copy \(buffer.id)"
        }
    }
}

var array = [COWStruct()]
array[0].buffer.value
array[0].buffer.id
array[0].change()
array[0].buffer.value
array[0].buffer.id


var dict = ["key": COWStruct()]
dict["key"]?.buffer.value
dict["key"]?.buffer.id
dict["key"]?.change()
dict["key"]?.buffer.value
dict["key"]?.buffer.id

// If the above `change()` was made on a copy, why has the original value changed?
// Did the copied & modified struct replace the original struct in the dictionary?

Asked Jun 19 '17 13:06 by deadbeef




1 Answer

dict["key"]?.change() // Copy

is semantically equivalent to:

if var value = dict["key"] {
    value.change() // Copy
    dict["key"] = value
} 

The value is pulled out of the dictionary, unwrapped into a temporary, mutated, and then placed back into the dictionary.

Because there are now two references to the underlying buffer (one from our local temporary value, and one from the COWStruct instance in the dictionary itself), the buffer is no longer uniquely referenced, so change() is forced to copy the underlying Buffer instance.
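To see the shared reference for yourself, here is a minimal sketch. Buffer and Wrapper are simplified stand-ins for the types in the question (not the originals), and the force unwrap is only there to keep the example short:

```swift
// Simplified stand-ins for the question's Buffer and COWStruct.
final class Buffer { var value = 0 }
struct Wrapper { var buffer = Buffer() }

let dict = ["key": Wrapper()]

// Reading through the subscript's getter hands back a copy of the struct.
// The copy shares the same Buffer object with the value still in the dictionary.
var temporary = dict["key"]!

// Two owners now exist (the dictionary and `temporary`), so this is false.
let isUnique = isKnownUniquelyReferenced(&temporary.buffer)
print(isUnique) // false
```

This is the state change() sees in the get/mutate/set expansion above, which is why it takes the copying branch.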

So, why doesn't

array[0].change() // No Copy

do the same thing? Surely the element should be pulled out of the array, mutated and then stuck back in, replacing the previous value?

The difference is that, unlike Dictionary's subscript, which consists of a getter and a setter, Array's subscript consists of a getter and a special accessor called mutableAddressWithPinnedNativeOwner.

What this special accessor does is return a pointer to the element in the array's underlying buffer, along with an owner object to ensure that the buffer isn't deallocated from under the caller. Such an accessor is called an addressor, as it deals with addresses.

Therefore when you say:

array[0].change()

you're actually mutating the actual element in the array directly, rather than a temporary.
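As an analogy only (this is not how the standard library implements the accessor), the public withUnsafeMutableBufferPointer API shows the same idea of mutating an element through a pointer into the array's storage rather than through a temporary. Counter is a hypothetical type used just for illustration:

```swift
// Hypothetical element type for illustration.
struct Counter { var value = 0 }

var array = [Counter(), Counter()]

// The closure receives a pointer-backed view of the array's own storage,
// so the element is mutated where it lives; nothing is pulled out and stored back.
array.withUnsafeMutableBufferPointer { storage in
    storage[0].value += 1
}

print(array[0].value) // 1
print(array[1].value) // 0
```

An addressor gives the compiler the same kind of direct access for array[0].change(), just without you having to write the pointer code yourself.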

Such an addressor cannot be directly applied to Dictionary's subscript because it returns an Optional, and the underlying value isn't stored as an optional. So it currently has to be unwrapped with a temporary, as we cannot return a pointer to the value in storage.

In Swift 3, you can avoid copying your COWStruct's underlying Buffer by removing the value from the dictionary before mutating the temporary:

if var value = dict["key"] {
    dict["key"] = nil
    value.change() // No Copy
    dict["key"] = value
}

This works because now only the temporary has a view onto the underlying Buffer instance, so change() finds it uniquely referenced.

And, as @dfri points out in the comments, this can be reduced down to:

if var value = dict.removeValue(forKey: "key") {
    value.change() // No Copy
    dict["key"] = value
}

saving on a hashing operation.

Additionally, for convenience, you may want to consider making this into an extension method:

extension Dictionary {
  mutating func withValue<R>(
    forKey key: Key, mutations: (inout Value) throws -> R
  ) rethrows -> R? {
    guard var value = removeValue(forKey: key) else { return nil }
    defer {
      updateValue(value, forKey: key)
    }
    return try mutations(&value)
  }
}

// ...

dict.withValue(forKey: "key") {
  $0.change() // No copy
}
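If you want to check that the extension really avoids the copy, a sketch along these lines should do it. The COWStruct here is a trimmed-down variant of the one in the question: its change() returns a Bool saying whether it had to copy (my own change for easier checking), and ObjectIdentifier stands in for the UUID:

```swift
final class Buffer { var value = 0 }

// Trimmed-down variant of the question's COWStruct.
struct COWStruct {
    var buffer = Buffer()

    // Returns true if the buffer had to be copied before mutating.
    mutating func change() -> Bool {
        if isKnownUniquelyReferenced(&buffer) {
            buffer.value += 1
            return false
        }
        let newBuffer = Buffer()
        newBuffer.value = buffer.value + 1
        buffer = newBuffer
        return true
    }
}

extension Dictionary {
    mutating func withValue<R>(
        forKey key: Key, mutations: (inout Value) throws -> R
    ) rethrows -> R? {
        // Remove first, so the local `value` is the only owner during the mutation.
        guard var value = removeValue(forKey: key) else { return nil }
        defer { updateValue(value, forKey: key) }
        return try mutations(&value)
    }
}

var dict = ["key": COWStruct()]
let idBefore = ObjectIdentifier(dict["key"]!.buffer)

let copied = dict.withValue(forKey: "key") { $0.change() }
let missing = dict.withValue(forKey: "absent") { $0.change() }

let idAfter = ObjectIdentifier(dict["key"]!.buffer)
print(copied == false, idBefore == idAfter, missing == nil)
```

The same Buffer object is still in the dictionary afterwards, and a missing key simply yields nil without running the closure.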

In Swift 4, you should be able to use the values property of Dictionary in order to perform a direct mutation of the value:

if let index = dict.index(forKey: "key") {
    dict.values[index].change()
}

This works because the values property now returns a special Dictionary.Values mutable collection that has a subscript with an addressor (see SE-0154 for more info on this change).
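A small sketch of the write-through behaviour, using plain Int values (my own example data) so that it only shows the mutation reaching the dictionary and says nothing about whether a copy happens:

```swift
var scores = ["alice": 1, "bob": 2]

// Look the key up once, then mutate the value in place through `values`.
if let index = scores.index(forKey: "alice") {
    scores.values[index] += 10
}

print(scores["alice"] ?? -1) // 11
print(scores["bob"] ?? -1)   // 2
```

The key is hashed once for index(forKey:), and the mutation then goes through the index rather than a second key lookup.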

However, currently (with the version of Swift 4 that ships with Xcode 9 beta 5), this still makes a copy. This is due to the fact that both the Dictionary and Dictionary.Values instances have a view onto the underlying buffer – as the values computed property is just implemented with a getter and setter that passes around a reference to the dictionary's buffer.

So when the addressor is called, a copy of the dictionary's buffer is triggered. That leaves two views onto COWStruct's Buffer instance, which in turn triggers a copy of it when change() is called.

I have filed a bug over this here. (Edit: This has now been fixed on master with the unofficial introduction of generalised accessors using coroutines, so will be fixed in Swift 5 – see below for more info).


In Swift 4.1, Dictionary's subscript(_:default:) now uses an addressor, so we can efficiently mutate values so long as we supply a default value to use in the mutation.

For example:

dict["key", default: COWStruct()].change() // No copy

The default: parameter uses @autoclosure such that the default value isn't evaluated if it isn't needed (such as in this case where we know there's a value for the key).
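You can observe the lazy evaluation with a counter. In this sketch, makeDefault and defaultEvaluations are hypothetical names of my own, and the values are plain Ints to keep it short:

```swift
var defaultEvaluations = 0

// Hypothetical helper that records each time the default is actually built.
func makeDefault() -> Int {
    defaultEvaluations += 1
    return 0
}

var counts = ["a": 1]

// Key present: the autoclosure is never run.
counts["a", default: makeDefault()] += 1
let evaluationsAfterExistingKey = defaultEvaluations

// Key absent: the default is evaluated once, inserted, then mutated.
counts["b", default: makeDefault()] += 1
let evaluationsAfterMissingKey = defaultEvaluations

print(evaluationsAfterExistingKey, evaluationsAfterMissingKey) // 0 1
```

So passing COWStruct() as the default costs nothing when the key is already there.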


Swift 5 and beyond

With the unofficial introduction of generalised accessors in Swift 5, two new underscored accessors have been introduced, _read and _modify, which use coroutines in order to yield a value back to the caller. For _modify, this can be an arbitrary mutable expression.

The use of coroutines is exciting because it means that a _modify accessor can now perform logic both before and after the mutation. This allows them to be much more efficient when it comes to copy-on-write types, as they can for example deinitialise the value in storage while yielding a temporary mutable copy of the value that's uniquely referenced to the caller (and then reinitialising the value in storage upon control returning to the callee).

The standard library has already updated many previously inefficient APIs to make use of the new _modify accessor – this includes Dictionary's subscript(_:) which can now yield a uniquely referenced value to the caller (using the deinitialisation trick I mentioned above).
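To give a feel for the trick, here is a rough sketch of a _modify accessor on a toy single-slot container. This is not the standard library's implementation: Slot and COWValue are hypothetical types, setting the stored optional to nil stands in for deinitialising storage, and _modify is underscored and not officially supported, so treat it as illustrative only:

```swift
final class Buffer { var value = 0 }

// Hypothetical copy-on-write value type.
struct COWValue {
    private var buffer = Buffer()
    var value: Int { buffer.value }

    // Returns true if the buffer had to be copied before mutating.
    mutating func increment() -> Bool {
        let mustCopy = !isKnownUniquelyReferenced(&buffer)
        if mustCopy {
            let newBuffer = Buffer()
            newBuffer.value = buffer.value
            buffer = newBuffer
        }
        buffer.value += 1
        return mustCopy
    }
}

// Hypothetical one-element container standing in for a dictionary slot.
struct Slot {
    private var storage: COWValue?
    init(_ initial: COWValue) { storage = initial }

    var value: COWValue? {
        get { storage }
        _modify {
            // Before the mutation: move the value out so the temporary is the only owner.
            var temporary = storage
            storage = nil
            // After the mutation: put the (possibly changed) value back.
            defer { storage = temporary }
            yield &temporary
        }
    }
}

var slot = Slot(COWValue())
let copied = slot.value?.increment()
print(slot.value?.value ?? -1) // 1
```

With a plain getter and setter, the same call would have to copy the buffer, since storage and the temporary would both own it during the mutation.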

The upshot of these changes is that:

dict["key"]?.change() // No copy

will be able to perform a mutation of the value without having to make a copy in Swift 5 (you can even try this out for yourself with a master snapshot).

Answered Oct 30 '22 14:10 by Hamish