How to read a line as a range in D?
I know there are ranges in D, but I just wondered how to simply iterate over each character of a string using this concept?
To show what I'm after, the similar code in Go is:
for _, someChar := range someString {
// Do something
}
That depends on whether you want to iterate over code units or code points. The language itself iterates over arrays by array elements, and strings are arrays of code units, so if you simply use foreach with type inference, then with
foreach(c; "La Verité")
writeln(c);
the last two characters printed would be gibberish, because é is a code point made up of two UTF-8 code units, and you're printing out individual code units (since char is a UTF-8 code unit). Whereas, if you do
foreach(dchar c; "La Verité")
writeln(c);
then the runtime will decode the code units to code points, and é will be printed as the last character. But none of this is really operating on strings as ranges. foreach operates on arrays natively without having to use the input range API. However, for all string types, the range API looks like
@property bool empty();
@property dchar front();
void popFront();
It operates on strings as ranges of dchar, not their code unit type. This avoids issues with functions like std.algorithm.filter operating on individual code units, since that would make no sense. Operating on code points isn't 100% correct either, since Unicode gets very complicated with regard to combining code points and graphemes and whatnot, but operating on code points is far closer to being correct (and I believe there's work being done on adding range support for graphemes into the standard library for the cases where you need that and are willing to pay the performance hit). So, having the range API for strings operate on them as ranges of dchar is far more correct, and if you did something like
foreach(c; filter!"true"("La Verité"))
writeln(c);
you would be iterating over dchar, and é would print correctly. The downside to all of this, of course, is that foreach on strings operates at the code unit level by default, whereas the range API for strings operates on them as code points, so you have to be careful when mixing array operations and range-based operations on strings. That's also why string and wstring are not considered random-access ranges, just bidirectional ranges. You can't do random access in O(1) on code points when they're made up of varying numbers of code units (whereas dstring is a random-access range, because with UTF-32, every code unit is a code point).
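The difference between the array view and the range view described above can be checked directly. The following is only a minimal sketch, assuming a Phobos version in which narrow strings are auto-decoded by the range primitives; the é is written as the escape \u00E9 so that it is certain to be a single precomposed code point.

```d
import std.range.primitives : ElementType, isBidirectionalRange,
    isRandomAccessRange, walkLength;
import std.stdio : writeln;

void main()
{
    // \u00E9 is é as one code point, encoded as two UTF-8 code units.
    string s = "La Verit\u00E9";

    // Array view: .length counts UTF-8 code units.
    writeln(s.length);     // 10
    // Range view: walkLength pops one decoded code point at a time.
    writeln(s.walkLength); // 9

    // The range element type of a string is dchar, not char.
    static assert(is(ElementType!string == dchar));

    // Narrow strings are bidirectional but not random-access ranges.
    static assert(isBidirectionalRange!string);
    static assert(!isRandomAccessRange!string);
    static assert(!isRandomAccessRange!wstring);
    // With UTF-32 every code unit is a code point, so dstring qualifies.
    static assert(isRandomAccessRange!dstring);
}
```

The static asserts are checked at compile time, so if the file compiles, the claims about which range categories each string type satisfies hold for that compiler and library version.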
foreach(ch; str)
do_something(ch);
A string is an InputRange. An InputRange implements three things: empty, front, and popFront.
foreach "understands" how to work with ranges, so it "just works".
But I don't speak Go, so I'm not entirely sure we're speaking the same language.
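For comparison, here is a rough sketch of driving those three primitives by hand instead of relying on foreach. It assumes the primitives come from std.range.primitives, which is where Phobos defines them for arrays. Note that a plain foreach over a string iterates the array natively by code unit, whereas this explicit loop goes through the range API and yields decoded dchar values.

```d
import std.range.primitives : empty, front, popFront;
import std.stdio : writeln;

void main()
{
    // \u00E9 is é written as a single precomposed code point.
    string str = "La Verit\u00E9";

    // r is a copy of the slice (pointer and length), so popFront
    // shrinks r and leaves str itself untouched.
    for (auto r = str; !r.empty; r.popFront())
    {
        dchar ch = r.front; // front decodes one code point
        writeln(ch);
    }
}
```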