
std.algorithm.joiner(string[],string) - why are the result elements dchar and not char?

Tags: d, dmd, phobos

I am trying to compile the following code:

import std.algorithm;
void main()
{
    string[] x = ["ab", "cd", "ef"]; // 'string' is the same as 'immutable(char)[]'
    string space = " ";
    char z = joiner( x, space ).front(); // error
}

Compiling with dmd fails with this error:

 test.d(8): Error: cannot implicitly convert expression (joiner(x,space).front()) of type dchar to char

Changing char z to dchar z fixes the error, but I'd like to know why it appears in the first place.

Why is the result of joiner(string[],string).front() a dchar and not a char?

(There is nothing about this in the documentation: http://dlang.org/phobos/std_algorithm.html#joiner)

dnsmkl asked Sep 05 '12 19:09



1 Answer

The range API in Phobos treats all strings as ranges of dchar. That's because a dchar is guaranteed to be a single code point: in UTF-32, every code unit is a code point, whereas in UTF-8 (char) and UTF-16 (wchar), the number of code units per code point varies. So if you operated on individual chars or wchars, you'd be operating on pieces of characters rather than whole characters, which breaks any character that takes more than one code unit. If you don't know much about Unicode, I'd advise reading this article by Joel Spolsky. It explains things fairly well.
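To make the code unit versus code point distinction concrete, here is a small sketch. The sample string is my own choice; it assumes only that 'é' (U+00E9) encodes as two UTF-8 code units, which is what the UTF-8 encoding specifies.

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    // 'é' is written as an escape so the result does not depend on
    // how the source file is encoded.
    string s = "h\u00E9llo";

    // .length counts code units (chars): 1 + 2 + 1 + 1 + 1.
    assert(s.length == 6);

    // walkLength iterates the string as a range of dchar,
    // so it counts code points.
    assert(s.walkLength == 5);

    // Indexing yields one code unit; s[1] is only the first byte of 'é'.
    assert(s[1] == 0xC3);

    // An explicit dchar loop variable makes foreach decode whole code points.
    foreach (dchar c; s)
        writeln(c);
}
```

The s[1] line is the kind of "piece of a character" that the range functions avoid handing you.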

In any case, because operating on individual chars and wchars doesn't make sense, strings of char and wchar are treated as ranges of dchar (ElementType!string is dchar), meaning that as far as ranges are concerned, they don't have length (hasLength!string is false, so walkLength needs to be used to get their length), aren't sliceable (hasSlicing!string is false), and aren't indexable (isRandomAccessRange!string is false). This also means that anything which builds a new range from any kind of string is going to result in a range of dchar. joiner is one of those. There are some functions which understand Unicode and special-case strings for efficiency, taking advantage of length, slicing, and indexing where they can, but unless their result is ultimately a slice of the original, any range they return is going to have to be made of dchars.
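The traits mentioned above can be checked at compile time. This is a minimal sketch that assumes a Phobos version where the traits are importable from std.range and where strings are still auto-decoded by default; the variable names are only illustrative.

```d
import std.algorithm : joiner;
import std.range : ElementType, hasLength, hasSlicing, isRandomAccessRange;

// As far as the range traits are concerned, a string is a range of dchar...
static assert(is(ElementType!string == dchar));
static assert(!hasLength!string);
static assert(!hasSlicing!string);
static assert(!isRandomAccessRange!string);

// ...even though the underlying array is still indexable by code unit.
static assert(is(typeof(string.init[0]) == immutable(char)));

void main()
{
    string[] x = ["ab", "cd", "ef"];
    auto joined = joiner(x, " ");

    // A range built from strings has dchar elements as well.
    static assert(is(ElementType!(typeof(joined)) == dchar));
}
```

If any of these assumptions did not hold for your compiler and library version, the program would fail to compile rather than fail at run time.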

So, front on any range of characters will always be dchar, and popFront will always pop off a full code point.
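As a rough illustration of front and popFront on a string, and of two ways to adapt the code from the question, consider the sketch below. Using std.array.join is my suggestion for the case where you want a char-based string back; it is not something the question asked about.

```d
import std.algorithm : joiner;
import std.array : join;
import std.range : front, popFront;

void main()
{
    string s = "\u00E9a"; // 'é' is two UTF-8 code units, 'a' is one
    assert(s.length == 3);

    // front decodes a whole code point, not a single char.
    dchar first = s.front;
    assert(first == '\u00E9');

    // popFront drops both code units of 'é' in one step.
    s.popFront();
    assert(s == "a");

    string[] x = ["ab", "cd", "ef"];

    // Option 1: accept the dchar that joiner's front returns.
    dchar z = joiner(x, " ").front;
    assert(z == 'a');

    // Option 2: build a real string eagerly, then index code units.
    string joined = join(x, " ");
    assert(joined == "ab cd ef");
    char c = joined[0];
    assert(c == 'a');
}
```

Option 2 allocates a new string, whereas joiner is lazy, so which one fits depends on whether you need the result as an array.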

If you don't know much about ranges, I'd advise reading this. It's a chapter in a book on D which is online and is currently the best tutorial on ranges that we have. We really should get a proper article on ranges (including on how they work with strings) onto dlang.org, but no one's gotten around to writing it yet. Regardless, you're going to need to have at least a basic grasp of ranges to be able to use a lot of D's standard library (especially std.algorithm), because it uses them very heavily.

Jonathan M Davis answered Nov 03 '22 06:11
