 

Distinguishing between failure and end of file in read loop

The idiomatic loop to read from an istream is

while (thestream >> value)
{
  // do something with value
}

Now this loop has one problem: it does not distinguish whether the loop terminated because of end of file or because of an error. For example, take the following test program:

#include <iostream>
#include <sstream>
#include <string>

void readbools(std::istream& is)
{
  bool b;
  while (is >> b)
  {
    std::cout << (b ? "T" : "F");
  }
  std::cout << " - " << is.good() << is.eof() << is.fail() << is.bad() << "\n";
}

void testread(std::string s)
{
  std::istringstream is(s);
  is >> std::boolalpha;
  readbools(is);
}

int main()
{
  testread("true false");
  testread("true false tr");
}

The first call to testread passes two valid bools, and therefore is not an error. The second call ends with a third, incomplete bool, and therefore is an error. Nevertheless, both behave the same. In the first case, reading the boolean value fails because there is none, while in the second case it fails because it is incomplete, and in both cases EOF is hit. Indeed, the program above outputs the same line twice:

TF - 0110
TF - 0110

To solve this problem, I thought of the following solution:

while (thestream >> std::ws && !thestream.eof() && thestream >> value)
{
  // do something with value
}

The idea is to detect regular EOF before actually trying to extract the value. Because there might be whitespace at the end of the file (which would not be an error, but would cause the read of the last item not to hit EOF), I first discard any whitespace (which cannot fail) and then test for EOF. Only if I'm not at the end of the file do I try to read the value.

For my example program, it indeed seems to work, and I get

TF - 0100
TF - 0110

So in the first case (correct input), fail() returns false.

Now my question: Is this solution guaranteed to work, or was I just (un-)lucky that it happened to give the desired result? Also: Is there a simpler (or, if my solution is wrong, a correct) way to get the desired result?

celtschk asked Nov 11 '11 22:11


1 Answer

It is very easy to differentiate between EOF and other errors, as long as you don't configure the stream to use exceptions.

Simply check stream.eof() after the read loop has ended.

Before that, check only for failure or non-failure, e.g. with stream.fail() or !stream. Note that good() is not the opposite of fail(): good() also returns false when only the end-of-file flag is set. So in general, never look at good(), only at fail().
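As a minimal sketch of that pattern (test only for failure inside the loop, look at eof() once afterwards), the following uses ints rather than bools to keep it short. The names ReadEnd, readInts and readIntsFrom are made up for illustration and appear nowhere in the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: why did the read loop stop?
enum class ReadEnd { CleanEof, BadInput };

// Reads ints until an extraction fails, then reports why the loop stopped.
ReadEnd readInts(std::istream& is, std::vector<int>& out)
{
    int value;
    while (is >> value)      // the loop condition tests !is.fail(), never good()
    {
        out.push_back(value);
    }
    // Only now look at eof(): if it is set, the input ran out;
    // if it is not set, extraction stopped at something unreadable.
    return is.eof() ? ReadEnd::CleanEof : ReadEnd::BadInput;
}

// Hypothetical convenience wrapper for trying the function on a string.
ReadEnd readIntsFrom(const std::string& text, std::vector<int>& out)
{
    std::istringstream is(text);
    return readInts(is, out);
}
```

With this sketch, readIntsFrom("1 2 3", v) should report CleanEof, while readIntsFrom("1 2 ? 3", v) should report BadInput after storing 1 and 2. It only separates errors that occur before the end of the input: as the "tr" case in the question shows, a token cut short by the end of input still sets both the eof and fail flags.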


Edit:

Some example code, namely your example modified to detect an invalid bool specification in the data:

#include <iostream>
#include <sstream>
#include <string>
#include <stdexcept>
using namespace std;

bool throwX( string const& s )  { throw runtime_error( s ); }
bool hopefully( bool v )        { return v; }

bool boolFrom( string const& s )
{
    istringstream stream( s );
    (stream >> boolalpha)
        || throwX( "boolFrom: failed to set boolalpha mode." );

    bool result;
    (stream >> result)
        || throwX( "boolFrom: failed to extract 'bool' value." );
        
    char c;  stream >> c;
    hopefully( stream.eof() )
        || throwX( "boolFrom: found extra characters at end." );
    
    return result;
}

void readbools( istream& is )
{
    string word;
    while( is >> word )
    {
        try
        {
            bool const b = boolFrom( word );
            cout << (b ? "T" : "F") << endl;
        }
        catch( exception const& x )
        {
            cerr << "!" << x.what() << endl;
        }
    }
    cout << "- " << is.good() << is.eof() << is.fail() << is.bad() << "\n";
}

void testread( string const& s )
{
    istringstream is( s );
    readbools( is );
}

int main()
{
  cout << string( 60, '-' ) << endl;
  testread( "true false" );

  cout << string( 60, '-' ) << endl;
  testread( "true false tr" );

  cout << string( 60, '-' ) << endl;
  testread( "true false truex" );
}

Example result:

------------------------------------------------------------
T
F
- 0110
------------------------------------------------------------
T
F
!boolFrom: failed to extract 'bool' value.
- 0110
------------------------------------------------------------
T
F
!boolFrom: found extra characters at end.
- 0110

Edit 2: Added an example of eof() checking to the posted code and results, which I had forgotten.


Edit 3: The following corresponding example uses the OP’s proposed skip-whitespace-before-reading solution:

#include <iostream>
#include <sstream>
#include <string>
using namespace std;

void readbools( istream& is )
{
    bool b;
    while( is >> ws && !is.eof() && is >> b )       // <- Proposed scheme.
    {
        cout << (b ? "T" : "F") << endl;
    }
    if( is.fail() )
    {
        cerr << "!readbools: failed to extract 'bool' value." << endl;
    }
    cout << "- " << is.good() << is.eof() << is.fail() << is.bad() << "\n";
}

void testread( string const& s )
{
    istringstream is( s );
    is >> boolalpha;
    readbools( is );
}

int main()
{
  cout << string( 60, '-' ) << endl;
  testread( "true false" );

  cout << string( 60, '-' ) << endl;
  testread( "true false tr" );

  cout << string( 60, '-' ) << endl;
  testread( "true false truex" );
}

Example result:

------------------------------------------------------------
T
F
- 0100
------------------------------------------------------------
T
F
!readbools: failed to extract 'bool' value.
- 0110
------------------------------------------------------------
T
F
T
!readbools: failed to extract 'bool' value.
- 0010

The main difference is that this approach reports three successfully read values in the third case, even though the third value is incorrectly specified (as "truex").

I.e. it fails to recognize an incorrect specification as such.

Of course, my ability to write Code That Does Not Work™ is no proof that it cannot work. But I am fairly good at coding things up, and I could not see any way to detect "truex" as incorrect with this approach (while it was easy to do with the read-words, exception-based approach). So at least for me, the read-words, exception-based approach is simpler, in the sense that it is easy to make it behave correctly.
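For comparison, here is one possible way to make the skip-whitespace scheme reject "truex": after each successful extraction, look at the next unread character and require whitespace or end of input. This is only a sketch under assumptions. It has not been checked against every standard library implementation, and ReadStatus, atTokenBoundary, readBoolsStrict and readBoolsFrom are hypothetical names, not part of the question or of the code above.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical status type for this sketch.
enum class ReadStatus { Ok, BadValue };

// True if the next unread character is whitespace, or if nothing is left.
// Looks at the stream buffer directly, so no stream state flags change.
bool atTokenBoundary(std::istream& is)
{
    using traits = std::char_traits<char>;
    const traits::int_type c = is.rdbuf()->sgetc();
    return traits::eq_int_type(c, traits::eof())
        || std::isspace(static_cast<unsigned char>(traits::to_char_type(c)));
}

ReadStatus readBoolsStrict(std::istream& is, std::vector<bool>& out)
{
    bool b;
    // The leading eof() test avoids applying std::ws to a stream that
    // has already reached end of input.
    while (!is.eof() && is >> std::ws && !is.eof() && is >> b)
    {
        if (!atTokenBoundary(is))
            return ReadStatus::BadValue;    // e.g. "truex": 'x' follows "true"
        out.push_back(b);
    }
    // A failed extraction (e.g. the truncated "tr") leaves fail() set.
    return is.fail() ? ReadStatus::BadValue : ReadStatus::Ok;
}

// Hypothetical wrapper mirroring testread() from the examples above.
ReadStatus readBoolsFrom(const std::string& text, std::vector<bool>& out)
{
    std::istringstream is(text);
    is >> std::boolalpha;
    return readBoolsStrict(is, out);
}
```

The extra boundary check and the additional eof() test are exactly the kind of bookkeeping that the read-words approach avoids, so this does not change the conclusion about which approach is easier to get right.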

Cheers and hth. - Alf answered Sep 22 '22 05:09