
Why can't I store string keys in an Associative Array?

I'm new to the D programming language and have just started reading The D Programming Language book.

I ran into an error when trying one of the associative array examples:

#!/usr/bin/rdmd
import std.stdio, std.string;

void main() {
    uint[string] dict;
    foreach (line; stdin.byLine()) {
        foreach (word; splitter(strip(line))) {
            if (word in dict) continue;
            auto newId = dict.length;
            dict[word] = newId;
            writeln(newId, '\t', word);
        }   
    }   
}

DMD shows this error message:

./vocab.d(11): Error: associative arrays can only be assigned values with immutable keys, not char[]

I'm using the DMD compiler, version 2.051.

I'm guessing the rules for associative arrays have changed since the TDPL book was written.

How should I use associative arrays with string keys?

Thanks.

Update:

I found the solution in later parts of the book.

Use idup to make an immutable duplicate of the key before putting it into the array.

So

dict[word.idup] = newId;

would do the job.

But is that efficient?

Visus Zhao asked Jan 06 '11 03:01





2 Answers

Associative arrays require that their keys be immutable. It makes sense when you think about the fact that if it's not immutable, then it might change, which means that its hash changes, which means that when you go to get the value out again, the computer won't find it. And if you go to replace it, you'll end up with another value added to the associative array (so, you'll have one with the correct hash and one with an incorrect hash). However, if the key is immutable, it cannot change, and so there is no such problem.
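The rule above can be checked with a small sketch. It assumes a reasonably recent DMD; the variable names are made up for illustration. It shows that inserting with a mutable char[] key does not compile, that a lookup with a char[] is still accepted, and that an immutable copy stored as the key is unaffected when the original buffer changes later.

```d
import std.stdio : writeln;

void main()
{
    uint[string] dict;
    char[] buf = "apple".dup;   // mutable buffer, like the ones byLine() hands out

    // Inserting with a mutable key is rejected at compile time.
    static assert(!__traits(compiles, dict[buf] = 1u));

    // An immutable copy is accepted, and lookups may still use the char[].
    dict[buf.idup] = 1;
    assert(buf in dict);

    // Changing the buffer afterwards cannot disturb the stored key.
    buf[0] = 'A';
    assert("apple" in dict);
    assert(buf !in dict);       // buf now reads "Apple", which was never inserted

    writeln(dict);
}
```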

Prior to dmd 2.051, the example worked (which was a bug). It has now been fixed though, so the example in TDPL is no longer correct. However, it's not so much the case that the rules for associative arrays have changed as that there was a bug in them which was not caught. The example compiled when it shouldn't have, and Andrei missed it. It's listed in the official errata for TDPL and should be fixed in future printings.

The corrected code should use either dict[word.idup] or dict[to!string(word)]. word.idup creates a duplicate of word which is immutable. to!string(word), on the other hand, converts word to a string in the most appropriate manner. As word is a char[] in this case, that would be to use idup. However, if word were already a string, then it would simply return the value which was passed in and not needlessly copy it. So, in the general case, to!string(word) is the better choice (particularly in templated functions), but in this case, either works just fine (to!() is in std.conv).
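For reference, here is one way the corrected program might look. This is a sketch rather than the errata text: addWords is a hypothetical helper introduced only to keep the loop body testable, splitter is imported explicitly from std.algorithm, and the cast to uint is an assumption added so the code also builds where dict.length is a 64-bit size_t.

```d
#!/usr/bin/rdmd
import std.algorithm : splitter;
import std.conv : to;
import std.stdio : stdin, writeln;
import std.string : strip;

// Hypothetical helper (not in TDPL): stores each unseen word in `dict`
// under the next free id and returns the words that were added.
string[] addWords(ref uint[string] dict, char[] line)
{
    string[] added;
    foreach (word; splitter(strip(line)))
    {
        if (word in dict) continue;            // lookup works with a char[] key
        auto newId = cast(uint) dict.length;   // length is size_t; cast to fit uint
        string key = to!string(word);          // copies the char[]; word.idup also works
        dict[key] = newId;
        added ~= key;
    }
    return added;
}

void main()
{
    uint[string] dict;
    foreach (line; stdin.byLine())
        foreach (word; addWords(dict, line))
            writeln(dict[word], '\t', word);
}
```

Only words that are not yet in the dictionary get copied, so repeated words cost a lookup but no allocation.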

It is technically possible to cast a char[] to a string, but it's generally a bad idea. If you know that the char[] will never change, then you can get away with it, but in the general case, you're risking problems, since the compiler will then assume that the resulting string can never change, and it could generate code which is incorrect. It may even segfault. So, don't do it unless profiling shows that you really need the extra efficiency of avoiding the copy, you can't otherwise avoid the copy by doing something like just using a string in the first place (so no conversion would be necessary), and you know that the string will never be changed.

In general, I wouldn't worry too much about the efficiency of copying strings. Generally, you should be using string instead of char[], so you can copy them around (that is, copy their reference around, e.g. str1 = str2;, rather than copying their entire contents like dup and idup do) without worrying about it being particularly inefficient. The problem with the example is that stdin.byLine() yields each line as a char[] rather than a string (presumably to avoid copying the data if it's not necessary). So, splitter() yields char[] slices, and so word is a char[] instead of a string. Now, you could do splitter(strip(line.idup)) or splitter(strip(line).idup) instead of iduping the key. That way, splitter() would yield strings rather than char[], but that's probably essentially just as efficient as iduping word. Regardless, because of where the text is coming from originally, it's a char[] instead of a string, which forces you to idup it somewhere along the line if you intend to use it as a key in an associative array. In the general case, however, it's better to just use string and not char[]. Then you don't need to idup anything.
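The difference between copying a reference and copying contents can be seen in a short sketch. sameMemory is a hypothetical helper, not a Phobos function; it only compares the starting addresses of two slices.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: true when both slices start at the same address.
bool sameMemory(const(char)[] a, const(char)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    string str2 = "hello world";

    // Assignment copies only the pointer and length; the characters are shared.
    string str1 = str2;
    assert(sameMemory(str1, str2));

    // idup allocates new memory and copies every character.
    string copy = str2.idup;
    assert(copy == str2 && !sameMemory(copy, str2));

    // to!string hands back a string argument unchanged...
    assert(sameMemory(to!string(str2), str2));

    // ...but has to copy when given a mutable char[].
    char[] buf = "hello".dup;
    assert(!sameMemory(to!string(buf), buf));

    writeln("all checks passed");
}
```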

EDIT:
Actually, even if you find a situation where casting from char[] to string seems both safe and necessary, consider using std.exception.assumeUnique(). It's essentially the preferred way of converting a mutable array to an immutable one when you need to and know that you can. It would typically be done in cases where you've constructed an array which you couldn't make immutable because you had to do it in pieces but which has no other references, and you don't want to create a deep copy of it. It wouldn't be useful in situations like the example that you're asking about though, since you really do need to copy the array.
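A minimal sketch of the situation where assumeUnique fits: the array is built in pieces in a buffer that nothing else references, then handed over as a string without a copy. alphabetKey is a hypothetical function invented for this illustration.

```d
import std.exception : assumeUnique;
import std.stdio : writeln;

// Hypothetical example: fills a freshly allocated buffer piece by piece,
// then converts it to an immutable string without copying.
string alphabetKey(size_t n)
{
    auto buf = new char[](n);              // no other references exist
    foreach (i, ref ch; buf)
        ch = cast(char)('a' + i % 26);
    return assumeUnique(buf);              // buf is reset to null here
}

void main()
{
    uint[string] dict;
    dict[alphabetKey(5)] = 0;              // usable as a key with no idup
    assert("abcde" in dict);

    // When passed an lvalue, assumeUnique nulls the source as a safety measure.
    char[] buf = "key".dup;
    string s = assumeUnique(buf);
    assert(s == "key" && buf is null);

    writeln(dict);
}
```

This would be wrong for the byLine() case in the question, because byLine() keeps its own reference to the buffer and overwrites it on the next line.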

Jonathan M Davis answered Nov 01 '22 13:11



No, it's not efficient, since it obviously duplicates the string. If you can guarantee that the string you create will never be modified in memory, feel free to use an explicit cast, cast(immutable) str, instead of duplicating it.

(Although, I've noticed that the garbage collector works well, so I suggest you don't actually try that unless you see a bottleneck, since you might decide to change the string later. Just place a comment in your code to help you find the bottleneck later, if it exists.)

user541686 answered Nov 01 '22 13:11
