
Initialize dictionary from transforming an array

Tags:

swift

Is there a way to declaratively initialize a dictionary from an array in Swift? I'm looking for something like this:

struct MyStruct {

   var key: Int
   var value: String
}

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 1, value: "b")]
let dict = array.someTransform { // Some arguments
   // Some transformation
}

So that dict is of type [Int: String]?

Note: I'm not looking for a solution including forEach since from this task's point of view it's just a more sophisticated version of for loop.

user3581248 asked May 29 '17 14:05



3 Answers

Dictionary's sequence initialiser

In Swift 4, assuming that the keys are guaranteed to be unique, you can simply say:

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 1, value: "b")]

let dict = Dictionary(uniqueKeysWithValues: array.lazy.map { ($0.key, $0.value) })

print(dict) // [0: "a", 1: "b"]

This is using the init(uniqueKeysWithValues:) initialiser from SE-0165. It expects a sequence of key-value tuples, where the keys are guaranteed to be unique (you'll get a fatal error if they aren't). So in this case, we're applying a lazy transform to the elements in your array in order to get a lazy collection of key-value pairs.

If the keys aren't guaranteed to be unique, you'll need some way of deciding which of the possible values to use for the given key. To do this, you can use the init(_:uniquingKeysWith:) initialiser from the same proposal, and pass a given function to determine which value to use for a given key upon a duplicate key arising.

The first argument to the uniquingKeysWith: function is the value that's already in the dictionary; the second is the value being inserted.

For example, here we're overwriting the value each time a duplicate key occurs in the sequence:

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 0, value: "b"),
             MyStruct(key: 1, value: "c")]

let keyValues = array.lazy.map { ($0.key, $0.value) }
let dict = Dictionary(keyValues, uniquingKeysWith: { _, latest in latest })

print(dict) // [0: "b", 1: "c"]

To keep the first value for a given key, and ignore any subsequent values for the same key, you'd want a uniquingKeysWith: closure of { first, _ in first }, giving a result of [0: "a", 1: "c"] in this case.
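
As a minimal sketch of the two closures discussed above (the names items, pairs, keepFirst and joined are only illustrative, and the concatenating closure is an extra variation not in the original answer), the same duplicate-key data can be resolved in different ways:

```swift
// Same shape of data as in the answer: key 0 appears twice.
struct MyStruct {
    var key: Int
    var value: String
}

let items = [MyStruct(key: 0, value: "a"), MyStruct(key: 0, value: "b"),
             MyStruct(key: 1, value: "c")]

// Lazy sequence of (key, value) tuples; it can be iterated more than once.
let pairs = items.lazy.map { ($0.key, $0.value) }

// Keep the value that is already stored and ignore later duplicates.
let keepFirst = Dictionary(pairs, uniquingKeysWith: { first, _ in first })

// Illustrative variation: combine both values instead of discarding one.
let joined = Dictionary(pairs, uniquingKeysWith: { existing, new in existing + new })

print(keepFirst[0] ?? "", joined[0] ?? "") // a ab
```

The closure is only called when a key repeats, so key 1 keeps its single value "c" in both dictionaries.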


Reduce with an inout accumulator

Another possible option in Swift 4, assuming you wish to merge any duplicate keys by overwriting the value at each occurrence of the given key, is to use reduce(into:_:), introduced in SE-0171.

Unlike reduce(_:_:), this method uses an inout parameter for the accumulator in the combination function. This allows it to avoid the unnecessary copying of the accumulator that would otherwise occur at each iteration of reduce(_:_:) when populating a dictionary accumulator. This therefore allows us to populate it in linear, rather than quadratic time.

You can use it like so:

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 0, value: "b"),
             MyStruct(key: 1, value: "c")]

let dict = array.reduce(into: [:]) { $0[$1.key] = $1.value }

print(dict) // [0: "b", 1: "c"]


// with initial capacity to avoid resizing upon populating.
let dict2 = array.reduce(into: Dictionary(minimumCapacity: array.count)) { dict, element in
    dict[element.key] = element.value
}

print(dict2) // [0: "b", 1: "c"]
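
Because the accumulator is mutated in place, the same pattern also works when you would rather collect duplicates than overwrite them. This is a hedged variation, not part of the original answer: it assumes you want an [Int: [String]] result, and the names items and grouped are illustrative.

```swift
struct MyStruct {
    var key: Int
    var value: String
}

let items = [MyStruct(key: 0, value: "a"), MyStruct(key: 0, value: "b"),
             MyStruct(key: 1, value: "c")]

// The accumulator is passed inout, so appending mutates it in place.
// subscript(_:default:) supplies an empty array the first time a key is seen.
let grouped = items.reduce(into: [Int: [String]]()) { result, element in
    result[element.key, default: []].append(element.value)
}

print(grouped[0] ?? []) // ["a", "b"]
```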
Hamish answered Oct 18 '22 22:10


using reduce

let dict = array.reduce([:]) { (d, s) -> [Int:String] in
    var d = d
    d[s.key] = s.value
    return d
}

As mentioned by @Martin R, this is not the best performer, but it is very easy to use. @Hamish's extension is nice; a plain loop is a little simpler and gives you at least the same performance:

var dict:[Int:String] = [:]
for s in array {
    dict[s.key] = s.value
}

Yes, I see that you would like to avoid the forEach version, but in reality it is a good and powerful solution.

var dict:[Int:String] = [:]
array.forEach {
    dict[$0.key] = $0.value
}

Keeping things as simple as possible reduces the chance of introducing unwanted side effects (bugs).

By defining a minimum capacity

var dict = Dictionary<Int,String>(minimumCapacity: array.count)
array.forEach {
    dict[$0.key] = $0.value
}

you get the best performer.

To compare the solutions:

do {
    let start = Date()
    let dict = Dictionary(uniqueKeysWithValues: array.lazy.map { ($0.key, $0.value) })
    let time = start.timeIntervalSince(Date())
    print(1,time, dict.count)
}

do {
    let start = Date()
    var dict = Dictionary<Int,String>(minimumCapacity: array.count)
    array.forEach {
        dict[$0.key] = $0.value
    }
    let time = start.timeIntervalSince(Date())
    print(2,time, dict.count)
}

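On my computer it prints the following (the times are negative because the code calls start.timeIntervalSince(Date()), which measures from the later date back to start):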

1 -1.93269997835159 10000000
2 -1.80712699890137 10000000

I like @Hamish's idea of using an inout parameter for the accumulating function. I tested it with the same data set:

do {
    let start = Date()
    let dict = array.reduce(into: Dictionary(minimumCapacity: array.count)) { dict, element in
        dict[element.key] = element.value
    }
    let time = start.timeIntervalSince(Date())
    print(3,time, dict.count)
}

I expected the same performance as the others above but unfortunately, it prints

3 -3.80046594142914 10000000

It looks like it needs about twice as much time to perform the same job.
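
The snippets above call start.timeIntervalSince(Date()), which is why the printed times are negative. A minimal sketch of a timing helper that reports elapsed seconds with the usual sign could look like this; measure is a hypothetical name, the 100,000-element sample is an assumption chosen to keep the run short, and wall-clock timing with Date is only a rough measurement:

```swift
import Foundation

// Hypothetical helper: runs `body` and returns its result with the elapsed seconds.
func measure<T>(_ body: () -> T) -> (result: T, seconds: TimeInterval) {
    let start = Date()
    let result = body()
    // Measuring from `start` to now gives the elapsed time with a positive sign.
    return (result, Date().timeIntervalSince(start))
}

struct MyStruct {
    var key: Int
    var value: String
}

// Assumed sample data with unique keys.
let sample = (0..<100_000).map { MyStruct(key: $0, value: String($0)) }

let timed = measure {
    Dictionary(uniqueKeysWithValues: sample.lazy.map { ($0.key, $0.value) })
}

print(timed.seconds, timed.result.count)
```

For a fair comparison, each variant should be built in release mode and run several times, since a single debug-build run can be dominated by unoptimized closure calls.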

user3441734 answered Oct 18 '22 20:10


You could also extend the Dictionary type to add an initializer that accepts an array of tuples:

extension Dictionary
{
  init(_ keyValues:[(Key,Value)] )
  {
     self.init(minimumCapacity: keyValues.underestimatedCount) // self.init()
     for (key,value) in keyValues { self[key] = value }
  }
}

struct MyStruct {

   var key: Int
   var value: String
}

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 1, value: "b")]
let dict = Dictionary(array.map{($0.key,$0.value)})

The for loop would still be there, but only within the Dictionary type, and you would not need boilerplate code to build the dictionary from an array in the various places where you need such an initialization.

[EDIT] changed init to use minimum capacity as suggested by 3441734. This should make it as fast as #1. I feel, however, that this optimization sacrifices a bit of simplicity for the sake of a very rare use case where such initializations would be a key performance factor.

Alain T. answered Oct 18 '22 20:10