
Turning a next(), hasNext() iterator interface into a begin(), end() interface

I have to use an external library that I cannot change. Among other things, this library can tokenize specially formatted files using its own internal logic. The tokenizer offers an iterator-like interface for accessing tokens, which looks like the following simplified example:

class Tokenizer {
public:
    /* ... */
    Token token() const; // returns the current token
    Token next() const; // returns the next token
    bool hasNext() const; // returns 'true' if there are more tokens
    /* ... */
};

I would like to implement an iterator wrapper for the Tokenizer above that allows the use of the standard algorithms library (std::copy_if, std::count, etc.). More specifically, it is sufficient if the iterator wrapper meets the requirements of an input iterator.

My current attempt looks like the following:

class TokenIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = Token;
    using difference_type = std::ptrdiff_t;
    using pointer = const value_type*;
    using reference = const value_type&;

    explicit TokenIterator(Tokenizer& tokenizer) :
            tokenizer(tokenizer) {
    }
    TokenIterator& operator++() {
        tokenizer.next();
        return *this;
    }
    value_type operator*() {
        return tokenizer.token();
    }

private:
    Tokenizer& tokenizer;
};

I got stuck implementing functions like begin and end, the equality comparison, etc. So, my questions are:

  • How can I construct a TokenIterator instance that indicates the end of the token sequence (i.e. hasNext() == false), and how can I compare it to another TokenIterator instance to decide whether they are the same?
  • Is it a good approach to return a value from operator*() instead of a reference?
Akira asked Nov 18 '25 16:11



1 Answer

First, I recommend taking a close look at http://www.boost.org/doc/libs/1_65_1/libs/iterator/doc/iterator_facade.html

I find that it vastly reduces the amount of boilerplate needed for code like this.

Then, you have to decide how you wish to represent an iterator that has reached the "end". One approach is to make a default-constructed iterator the "end" iterator. It contains no object, and you must not increment or dereference it. The "begin" iterator is then a non-default-constructed iterator. It holds an object, and you can dereference it. Incrementing this iterator simply checks hasNext(). If true, set the contained object to next(). If false, clear the contained object so that this iterator looks like a default-constructed one.
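The approach described above could be sketched roughly as follows. This is only a minimal sketch under several assumptions: the Token alias, the stub Tokenizer, the TokenRange helper and the countTokens function are hypothetical stand-ins that are not part of the library, the tokenizer is assumed to be positioned on its first token right after construction, and std::optional requires C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library types, only so the sketch compiles.
using Token = std::string;

class Tokenizer {
public:
    explicit Tokenizer(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}
    Token token() const { return tokens_[pos_]; }    // current token
    Token next() const { return tokens_[++pos_]; }   // advance, then return the new current token
    bool hasNext() const { return pos_ + 1 < tokens_.size(); }
private:
    std::vector<Token> tokens_;
    mutable std::size_t pos_ = 0;  // mutable only to mirror the const signatures in the question
};

class TokenIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = Token;
    using difference_type = std::ptrdiff_t;
    using pointer = const Token*;
    using reference = const Token&;

    // Default-constructed iterator is the "end" iterator: no tokenizer, no token.
    TokenIterator() = default;

    // "Begin" iterator: caches the tokenizer's current token.
    explicit TokenIterator(Tokenizer& tokenizer)
        : tokenizer_(&tokenizer), current_(tokenizer.token()) {}

    // The token is cached inside the iterator, so a const reference can be returned.
    reference operator*() const { return *current_; }
    pointer operator->() const { return &*current_; }

    TokenIterator& operator++() {
        assert(tokenizer_ != nullptr && "cannot increment the end iterator");
        if (tokenizer_->hasNext()) {
            current_ = tokenizer_->next();
        } else {
            // Exhausted: become indistinguishable from a default-constructed iterator.
            tokenizer_ = nullptr;
            current_.reset();
        }
        return *this;
    }

    TokenIterator operator++(int) {
        TokenIterator old = *this;  // keeps the previously cached token
        ++*this;
        return old;
    }

    // Input iterators only need to distinguish "end" from "not end".
    friend bool operator==(const TokenIterator& a, const TokenIterator& b) {
        return a.tokenizer_ == b.tokenizer_;
    }
    friend bool operator!=(const TokenIterator& a, const TokenIterator& b) {
        return !(a == b);
    }

private:
    Tokenizer* tokenizer_ = nullptr;
    std::optional<Token> current_;
};

// Hypothetical range wrapper providing begin()/end() for range-based for loops.
struct TokenRange {
    Tokenizer& tokenizer;
    TokenIterator begin() const { return TokenIterator(tokenizer); }
    TokenIterator end() const { return TokenIterator(); }
};

// Usage: count how many tokens equal `wanted` with a standard algorithm.
std::ptrdiff_t countTokens(Tokenizer& tokenizer, const Token& wanted) {
    TokenRange range{tokenizer};
    return std::count(range.begin(), range.end(), wanted);
}
```

Because all copies of the iterator share one underlying tokenizer, the sequence can be traversed only once, which matches the single-pass guarantee of an input iterator. With this interface an empty token sequence cannot be detected at construction, so the sketch assumes at least one token exists.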

There shouldn't be any problems returning by value from operator*. Even if you bind the result to a const reference (or to auto&&), lifetime extension will keep the value around until the reference goes out of scope. That said, any code that assumes such references remain valid over multiple iterations WILL break, so stick to simple for (auto val : tokens) or for (const auto& val : tokens).
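To illustrate the point about returning by value, here is a small sketch with a hypothetical ByValueIter and ByValueRange (neither comes from the question or the library). It shows that a const reference in a range-based for loop binds to the temporary returned by operator*, and that the temporary lives only for the current iteration. Note that a plain non-const auto& cannot bind to such a temporary at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical iterator whose operator*() returns a copy instead of a reference.
struct ByValueIter {
    const std::vector<std::string>* data;
    std::size_t index;

    std::string operator*() const {
        assert(index < data->size());
        return (*data)[index];  // returns by value: a fresh temporary per dereference
    }
    ByValueIter& operator++() { ++index; return *this; }
    bool operator!=(const ByValueIter& other) const { return index != other.index; }
};

// Hypothetical range exposing begin()/end() for range-based for loops.
struct ByValueRange {
    std::vector<std::string> data;
    ByValueIter begin() const { return ByValueIter{&data, 0}; }
    ByValueIter end() const { return ByValueIter{&data, data.size()}; }
};

std::size_t totalLength(const ByValueRange& range) {
    std::size_t total = 0;
    // The const reference extends the temporary's lifetime to the end of this iteration.
    for (const auto& s : range) {
        total += s.size();
    }
    // for (auto& s : range) {}  // would not compile: non-const lvalue reference to a temporary
    return total;
}
```

Storing a pointer or reference to s and reading it in a later iteration would be a dangling access, since each iteration gets its own temporary.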

Filipp answered Nov 21 '25 06:11



