Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

bidirectional iterator over file/ifstream

I need an input file stream which would have a bidirectional iterator/adapter.

Unfortunately std::ifstream (and similar) can be used only with std::istream_iterator which is a kind of forward iterator which cannot go backwards. (or am I mistaken here?)

I could simply load the whole file to memory and then use a much more powerful random-access iterator over the array; however I would like to avoid that, and read only as much as I really need. It may happen that I really need only a small portion of a file.

I could somehow do it manually using C stdio.h functions, but that will be painful. I would basically need to implement a bidirectional iterator, with all its specification in mind, by hand.

I am considering looking into boost iostream library, but the manual is somewhat overwhelming, I was hoping someone could give me a hand to achieve this particular goal? Or maybe there is another already existing library to do exactly what I need?

I need the iterator for the boost xpressive library to parse my files, which expects that the iterator can be incremented as well as decremented. I would be fine if the file I am reading is buffered, although this is not a requirement.

Any ideas? Thank you!

like image 251
CygnusX1 Avatar asked Jan 26 '12 22:01

CygnusX1


2 Answers

If I were to traverse a file in the wrong direction, I would start off questioning my requirements. This seems to be a contrived way to do things and most likely something got messed up dramatically somewhere.

Once I confirmed that this is indeed the requirement I would realize that we are definitely talking files here, rather than e.g. a named pipe or a socket. That is, it would be possible memory map at least parts of the file. I would use this to create an iterator which walks the memory. Since obviously an iterator is needed, there is no need to involve streams. If streams were needed too, I would still memory map the file and reverse buffers from the back in a custom stream buffer.

When actually reading from the start and just needing to be able to move backwards when necessary, it may be simpler than this: keeping a buffer of the already read data and expending it when moving off the end plus possibly reading the entire file if the end iterator is used to move backwards should address this. Here is code which certainly can read a file forward and backward but isn't thoroughly tested:

#include <iostream>
#include <fstream>
#include <algorithm>
#include <iterator>
#include <limits>
#include <vector>

class bidirectional_stream
{
public:
    class                                         iterator;
    typedef iterator                              const_iterator;
    typedef std::reverse_iterator<iterator>       reverse_iterator;
    typedef std::reverse_iterator<const_iterator> const_reverse_iterator;

    bidirectional_stream(std::istream& in):
        in_(in)
    {
    }
    iterator         begin();
    iterator         end();
    reverse_iterator rbegin();
    reverse_iterator rend();

    bool expand()
    {
        char buffer[1024];
        this->in_.read(buffer, sizeof(buffer));
        this->buffer_.insert(this->buffer_.end(), buffer, buffer + this->in_.gcount());
        return 0 < this->in_.gcount();
    }
    long read_all()
    {
        this->buffer_.insert(this->buffer_.end(),
                             std::istreambuf_iterator<char>(this->in_),
                             std::istreambuf_iterator<char>());
        return this->buffer_.size();
    }
    char get(long index) { return this->buffer_[index]; }
    long current_size() const { return this->buffer_.size(); }
private:
    std::istream&     in_;
    std::vector<char> buffer_;
};

class bidirectional_stream::iterator
{
public:
    typedef char                            value_type;
    typedef char const*                     pointer;
    typedef char const&                     reference;
    typedef long                            difference_type;
    typedef std::bidirectional_iterator_tag iterator_category;

    iterator(bidirectional_stream* context, size_t pos):
        context_(context),
        pos_(pos)
    {
    }

    bool operator== (iterator const& other) const
    {
        return this->pos_ == other.pos_
            || (this->pos_ == this->context_->current_size()
                && !this->context_->expand()
                && other.pos_ == std::numeric_limits<long>::max());
    }
    bool operator!= (iterator const& other) const { return !(*this == other); }
    char      operator*() const { return this->context_->get(this->pos_); }
    iterator& operator++()    { ++this->pos_; return *this; }
    iterator  operator++(int) { iterator rc(*this); this->operator++(); return rc; }
    iterator& operator--()
    {
        if (this->pos_ == std::numeric_limits<long>::max())
        {
            this->pos_ = this->context_->read_all();
        }
        --this->pos_;
        return *this;
    }
    iterator  operator--(int) { iterator rc(*this); this->operator--(); return rc; }

private:
    bidirectional_stream* context_;
    long                  pos_;
};

bidirectional_stream::iterator bidirectional_stream::begin()
{
    return iterator(this, 0);
}
bidirectional_stream::iterator bidirectional_stream::end()
{
    return iterator(this, std::numeric_limits<long>::max());
}

bidirectional_stream::reverse_iterator bidirectional_stream::rbegin()
{
    return reverse_iterator(this->end());
}
bidirectional_stream::reverse_iterator bidirectional_stream::rend()
{
    return reverse_iterator(this->begin());
}

Just create a bidirectional_stream with the stream you want to read as an argument and then use the begin() and end() methods to actually access it.

like image 184
Dietmar Kühl Avatar answered Nov 15 '22 20:11

Dietmar Kühl


Since you're already using boost, take a look at boost::iostreams::mapped_file_source http://www.boost.org/doc/libs/release/libs/iostreams/doc/classes/mapped_file.html#mapped_file_source

You can use file.data() as the begin iterator and file.data() + file.size() as the end iterator.

like image 24
Cubbi Avatar answered Nov 15 '22 21:11

Cubbi