
How do you use ranges in D?

Tags: range, d, phobos

Whenever I try to use ranges in D, I fail miserably.

What is the proper way to use ranges in D? (See inline comments for my confusion.)

void print(R)(/* ref? auto ref? neither? */ R r)
{
    foreach (x; r)
    {
        writeln(x);
    }

    // Million $$$ question:
    //
    // Will I get back the same things as last time?
    // Do I have to check for this every time?

    foreach (x; r)
    {
        writeln(x);
    }
}

void test2(alias F, R)(/* ref/auto ref? */ R items)
{
    // Will it consume items?
    // _Should_ it consume items?
    // Will the caller be affected? How do I know?
    // Am I supposed to?
    F(items);
}
user541686 asked Jun 25 '12 13:06


4 Answers

You should probably read this tutorial on ranges if you haven't.

Whether a range will be consumed depends on its type. If it's an input range and not a forward range (e.g. if it's an input stream of some kind; the range returned by std.stdio.File.byLine would be one example of this), then iterating over it in any way, shape, or form will consume it.

//Will consume
auto result = find(inRange, needle);

//Will consume
foreach(e; inRange) {}
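
To make the input-range case concrete, here is a minimal sketch. Countdown and drain are hypothetical names made up for this illustration (they are not part of Phobos); Countdown is a class, so it is a reference type, and it deliberately has no save.

```d
import std.range; // isInputRange, isForwardRange

// Hypothetical input-only range: a class (reference type) without save.
final class Countdown
{
    private int n;
    this(int n) { this.n = n; }
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

static assert(isInputRange!Countdown);
static assert(!isForwardRange!Countdown); // no save, so not a forward range

// Collects whatever the range still yields.
int[] drain(R)(R r)
{
    int[] seen;
    foreach (x; r)
        seen ~= x;
    return seen;
}

unittest
{
    auto r = new Countdown(3);
    assert(drain(r) == [3, 2, 1]); // the first pass sees everything
    assert(r.empty);               // the caller's range is now used up
    assert(drain(r).length == 0);  // a second pass sees nothing
}
```

Compiling with -unittest -main runs the checks. The second pass over the same object comes back empty, which is the behaviour the question was worried about.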

If it's a forward range and it's a reference type, then it will be consumed whenever you iterate over it, but you can call save to get a copy of it, and consuming the copy won't consume the original (nor will consuming the original consume the copy).

//Will consume
auto result = find(refRange, needle);

//Will consume
foreach(e; refRange) {}

//Won't consume
auto result = find(refRange.save, needle);

//Won't consume
foreach(e; refRange.save) {}
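
A rough sketch of the same thing with a concrete type. Counter is a hypothetical class written only for this example; walkLength is the Phobos function that counts elements by iterating the range it is given.

```d
import std.range; // isForwardRange, walkLength

// Hypothetical reference-type forward range, for illustration only.
final class Counter
{
    private int cur, end;
    this(int cur, int end) { this.cur = cur; this.end = end; }
    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
    // save hands back an independent copy of the iteration state.
    @property Counter save() { return new Counter(cur, end); }
}

static assert(isForwardRange!Counter);

unittest
{
    auto r = new Counter(0, 3);
    assert(walkLength(r.save) == 3); // only the saved copy is walked
    assert(!r.empty);                // the original is untouched
    assert(walkLength(r) == 3);      // passing r itself shares the object
    assert(r.empty);                 // so the original is now consumed
}
```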

Where things get more interesting is forward ranges which are value types (or arrays). They act the same as any other forward range with regard to save, but they differ in that simply passing them to a function or using them in a foreach implicitly saves them, because the function or loop works on a copy.

//Won't consume
auto result = find(valRange, needle);

//Won't consume
foreach(e; valRange) {}

//Won't consume
auto result = find(valRange.save, needle);

//Won't consume
foreach(e; valRange.save) {}
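
As a small sketch of the value-type case: exhaust is a hypothetical helper written for this example, and it pops its own parameter until it is empty. The array and the struct returned by iota are both copied on the way in, so the caller's ranges are left alone.

```d
import std.range; // empty, popFront for arrays, iota

// Hypothetical helper: pops its (by-value) parameter until it is empty.
size_t exhaust(R)(R r)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

unittest
{
    int[] arr = [1, 2, 3]; // the slice is copied when passed by value
    auto r = iota(0, 3);   // iota returns a struct-based range

    assert(exhaust(arr) == 3);
    assert(arr.length == 3); // the caller's slice is unchanged

    assert(exhaust(r) == 3);
    assert(!r.empty);        // the caller's struct is unchanged

    foreach (x; r) {}        // foreach also iterates over a copy
    assert(r.front == 0);
}
```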

So, if you're dealing with an input range which isn't a forward range, it will be consumed regardless. And if you're dealing with a forward range, you need to call save if you want to guarantee that it isn't consumed; otherwise, whether it's consumed or not depends on its type.

With regard to ref: if you declare a range-based function to take its argument by ref, then the range won't be copied, so it won't matter whether the range passed in is a reference type or not. It does mean that you can't pass an rvalue, though, which would be really annoying. So you probably shouldn't use ref on a range parameter unless you actually need the function to always mutate the original (e.g. std.range.popFrontN takes a ref because it explicitly mutates the original rather than potentially operating on a copy).
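
A minimal sketch of the difference, using an array as the range. skipTwo and skipTwoOnCopy are hypothetical names invented for this example; popFrontN is the Phobos function mentioned above.

```d
import std.range; // popFrontN

// Takes the range by ref: always advances the caller's range.
void skipTwo(R)(ref R r) { r.popFrontN(2); }

// Takes the range by value: advances a copy. With a reference-type
// range the caller would still be affected.
void skipTwoOnCopy(R)(R r) { r.popFrontN(2); }

unittest
{
    auto a = [1, 2, 3, 4];

    skipTwoOnCopy(a);
    assert(a == [1, 2, 3, 4]); // only the copied slice was advanced

    skipTwo(a);
    assert(a == [3, 4]);       // the caller's slice was advanced

    // skipTwo([1, 2, 3, 4]); // rejected by default: an rvalue can't bind to ref
}
```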

As for calling range-based functions with forward ranges, value type ranges are most likely to work properly, since far too often code is written and tested with value type ranges and isn't always properly tested with reference types. Unfortunately, this includes Phobos' functions (though that will be fixed; it just hasn't been properly tested for in all cases yet. If you run into any cases where a Phobos function doesn't work properly with a reference type forward range, please report it). So, reference type forward ranges don't always work as they should.

Jonathan M Davis answered Nov 02 '22 01:11


Sorry, I can't fit this into a comment :D. Consider if Range were defined this way:

interface Range {
    void doForeach(void delegate() myDel);
}

And your function looked like this:

void myFunc(Range r) {
    // doForeach is a member of Range, so call it on r
    r.doForeach(() {
        //blah
    });
}

You wouldn't expect anything strange to happen when you reassigned r, nor would you expect to be able to modify the caller's Range. I think the problem is that you are expecting your template function to be able to account for all of the variation in range types, while still taking advantage of the specialization. That doesn't work. You can apply a contract to the template to take advantage of the specialization, or use only the general functionality. Does this help at all?

Edit (what we've been talking about in comments):

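import std.range : isForwardRange;

// The constraint is checked against the type R, not the value r
void funcThatDoesntRuinYourRanges(R)(R r)
if (isForwardRange!R) {
    //do some stuff (on r.save if the caller's range must stay intact)
}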
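
One way the body of such a function might look, as a sketch only: twice is a hypothetical function that iterates its argument two times through save, and Counter is a hypothetical class-based forward range used to check the reference-type case.

```d
import std.range; // isForwardRange, ElementType, save for arrays

// Hypothetical reference-type forward range, for illustration only.
final class Counter
{
    private int cur, end;
    this(int cur, int end) { this.cur = cur; this.end = end; }
    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
    @property Counter save() { return new Counter(cur, end); }
}

// Iterates twice over saved copies, leaving the caller's range alone.
ElementType!R[] twice(R)(R r)
if (isForwardRange!R)
{
    ElementType!R[] result;
    foreach (x; r.save)
        result ~= x;
    foreach (x; r.save)
        result ~= x;
    return result;
}

unittest
{
    auto c = new Counter(0, 3);
    assert(twice(c) == [0, 1, 2, 0, 1, 2]); // both passes see every element
    assert(!c.empty);                       // the caller's range is intact
    assert(twice([7, 8]) == [7, 8, 7, 8]);  // arrays work the same way
}
```

The constraint only guarantees that save exists. It is the explicit r.save calls that keep a reference-type range from being consumed.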

Edit 2: Looking at std.range, it looks like isForwardRange simply checks whether save is defined, and save is just a primitive that makes a sort of un-linked copy of the range. The docs specify that save is not defined for e.g. files and sockets.

Tim answered Nov 02 '22 02:11


The short of it: ranges are consumed. This is what you should expect and plan for.

The ref on the foreach plays no role in this; it only relates to the value returned by the range.

The long: ranges are consumed, but may get copied. You'll need to look at the documentation to decide what will happen. Value types get copied, and thus a range may not be modified when passed to a function, but you can't rely on that just because the range is a struct, as the underlying data stream may be a reference, e.g. FILE. And of course a ref function parameter will add to the confusion.
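
To illustrate the struct caveat, here is a sketch of a struct range whose state sits behind a pointer, loosely standing in for something like a file handle. SharedCountdown and exhaust are hypothetical names made up for this example.

```d
import std.range; // isInputRange

// Hypothetical struct range whose iteration state lives behind a pointer.
struct SharedCountdown
{
    private int* remaining;
    this(int n) { remaining = new int; *remaining = n; }
    @property bool empty() const { return *remaining <= 0; }
    @property int front() const { return *remaining; }
    void popFront() { *remaining -= 1; }
}

static assert(isInputRange!SharedCountdown);

// Hypothetical helper: pops its (by-value) parameter until it is empty.
size_t exhaust(R)(R r)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

unittest
{
    auto r = SharedCountdown(3);
    assert(exhaust(r) == 3); // the struct itself was copied...
    assert(r.empty);         // ...but the copy shared the pointed-to state
}
```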

he_the_great answered Nov 02 '22 01:11


Say your print function looks like this:

void print(R)(R r) {
  foreach (x; r) {
    writeln(x);
  }
}

Here, r is passed by value, so the function works on its own copy of the range and you don't need ref here (and a bare auto on the parameter will give a compilation error). For a dynamic array, that copy is a slice that still refers to the same elements, but iterating over the copy doesn't shorten the caller's array. Otherwise, this will print the contents of r, item by item. (I seem to remember there being a way to constrain the generic type to that of a range, because ranges have certain properties, but I forget the details!)
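
For what it's worth, the constraint mentioned in the parenthesis could look something like the sketch below. It assumes the isInputRange trait from Phobos is the property check being remembered; the function is otherwise the same print as above.

```d
import std.range; // isInputRange
import std.stdio : writeln;

// Same print as above, but only instantiable with input ranges.
void print(R)(R r)
if (isInputRange!R)
{
    foreach (x; r)
        writeln(x);
}

unittest
{
    static assert(__traits(compiles, print([1, 2, 3]))); // an array is a range
    static assert(!__traits(compiles, print(42)));       // an int is not
}
```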

Anyway:

auto myRange = [1, 2, 3];
print(myRange);
print(myRange);

...will output:

1
2
3
1
2
3

If you change your function to (presuming x++ makes sense for your range):

void print(R)(R r) {
  foreach (x; r) {
    x++;
    writeln(x);
  }
}

...then each element will be increased before being printed, but this is using copy semantics. That is, the original values in myRange won't be changed, so the output will be:

2
3
4
2
3
4

If, however, you change your function to:

void print(R)(R r) {
  foreach (ref x; r) {
    x++;
    writeln(x);
  }
}

...then x refers to the original elements of myRange instead of a copy, so the increments persist between calls. Hence the output will now be:

2
3
4
3
4
5
Xophmeister answered Nov 02 '22 02:11