Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

the patterns used in iterators

I am familiar with the usage of C++ STL iterators, e.g.

for(map<pair<int,int>>::iterator it=m.begin(); it!=m.end(); ++it)
  int  a = it->first;
  int b = it->second;

But I don't know the inner details in it. Could some explain to me? Either in C++, Java, C# or Python.

like image 885
user253951 Avatar asked Dec 13 '22 00:12

user253951


2 Answers

In C++ iterators into a container resemble pointers into an array (I am assuming that you are familiar with pointers). There are different flavors of iterators, but at the end they are just a way of referring to elements within the container (through the dereference operators * and ->) and transversing the elements in the container.

The important part is not the implementation, but rather the concept. You do not need to know how an iterator into a list or a vector are implemented (or how they differ in many cases), only what operations it provides. Iterators into different containers will have different implementations (for a list it will follow some next pointer in the node, for a map it will follow either the right child or parent pointer of the balanced tree. In fact iterators into the same container can be implemented in different ways (and some compilers do have more than one implementation for any given container) depending on compilation flags or mode. But still, the important part is that you really don't care how they are, just what they allow you to do.

As a simple example, in g++ STL implementation std::vector contains three pointers, something like:

//...
class vector {
   T * _b;  // beginning of allocated memory
   T * _e;  // one past the last inserted element 
   T * _c;  // one past the end of the allocated memory
//...
}

such that size() = (_e - _b) / sizeof(T) and capacity = (_c - _b) / sizeof(T). With this implementation of vector, you can use a raw pointer as an iterator:

//...
class vector {
public:
   typedef T* iterator;
   T* begin() { return _b; }
   T* end() { return _e; }
//...
}

but you can also build more complex (slower but safer) iterator implementations, like checked iterators that will trigger an assertion if the iterator has been invalidated (this code is oversimplified just for sample purposes):

template <typename T>
class checked_iterator {
public:
   checked_iterator( std::vector<T> & v, std::size_t e )
      : _check_begin(v._b), _vector(v), _pos( v._b + e )
   {}
   T& operator*() {
      assert( _check_begin == _vector._b && "Dereferencing an invalidated iterator");
      return *_pos; 
   }
   // ...
private:
   T * _pos;
   T const * const _check_begin;
   std::vector<T>& _vector;
};

This iterator implementation would detect dereferencing an invalid iterator (only in the case of a whole vector relocation, but by storing more data it could do a full check) and will abort the execution of an incorrect program while still in development. From the user point of view it would be a plain RandomAccessIterator (it should be, that is a requirement for vector iterators) but behind the scenes it will provide a mechanism to identify otherwise hard to detect errors.

This is the approach in the VS compilers: while in debug mode (and depending on compiler flags) it will use slow safe iterators that will help detecting access through invalidated iterators (for a vector, an iterator is invalidated whenever an element is added into the container). At the same time, changing the compiler flags you can get the plain raw pointer implementation that will be much faster for production systems, but it would be much harder to debug invalid iterator usage.

In Java and C# they are actually objects that implement a couple of simple operations (in Java hasNext(), next() and remove()) that allow for a transversal of a whole container hidding how the container is implemented. They are quite similar in that the intention is encapsulating the operations performed on a particular container from user code.

One important difference is that in both cases you can use them to iterate over the whole container, but in c++ they are comparable and you can iterate over any two iterators into the same container. For example, in a map containing your city phone book you could use operations to get an iterator into the first name that begins with a c, and another search to get the first element whose name begins with 'd' (assuming name ordering) and you could use any STL (or your own) algorithm with those two iterator to perform the operation only on that subset of people.

like image 145
David Rodríguez - dribeas Avatar answered Dec 23 '22 09:12

David Rodríguez - dribeas


iterator is just an abstract notion for which it is defined that dereferenceing it, using the * operator will give you a certain element from the sequence associated with the iterator and incrementing it will give you an iterator associated with the next element in some sequence. This means that the concrete iterator is defined by the sequence and is not a separately defined class, i.e. why you need to use type map<pair<int,int> >::iterator and not just iterator. The iterator type for each STL sequence has it's own implementation for which the ++ operator is overloaded to provide an iterator pointing to the next element inthe sequence.

This means that a simple pointer into a character array is also an iterator, since dereferencing the pointer will give you the object associated with the iterator (pointer) and incrementing it will give you a new iterator (pointer) associated with the next element in the sequence.

A partial example for a doubly linked list, (note: untested, just wrote this up, may need to add some friend clauses and stuff):

class doubly_linked_list {
  class node {
    node* prev;
    node* next;
    int data;
    node(const int data, node* prev, node* next) : data(data), prev(prev), next(next) {}
  };
  static const node nil; // = node(0, 0, 0)
  node* head;
  node* tail;
public:
  class iterator {
    node* current;
    iterator(node* n) : current(n) {}
  public:
    int& operator*() const {
      return current->obj;
    }
    iterator& operator++() {
      current = current->next;
      return *this;
    }
    iterator& operator--() {
      current = current->prev;
      return *this;
    }
  };
  double_linked_list() : head(nil), tail(nil) {}
  iterator begin() const {
    return iterator(head);
  }
  iterator end() const {
    return iterator(tail);
  }
};

And to illustrate how different data structures will have different iterators a vector example (just as untested)

class vector {
  int* data;
  size_t len;
public:
  typedef int* iterator;
  vector(const size_t len) : len(len) {
    data = new int[len];
  }
  int& operator[](int i) const {
    return data[i];
  }
  iterator begin() const {
    return data;
  }
  iterator end() const {
    return data + len;
  }
};
like image 27
wich Avatar answered Dec 23 '22 08:12

wich