This code loops forever:
#include <iostream>
#include <fstream>
#include <sstream>

int main(int argc, char *argv[])
{
    std::ifstream f(argv[1]);
    std::ostringstream ostr;

    while(f && !f.eof())
    {
        char b[5000];
        std::size_t read = f.readsome(b, sizeof b);
        std::cerr << "Read: " << read << " bytes" << std::endl;
        ostr.write(b, read);
    }
}
It's because readsome is never setting eofbit.
cplusplus.com says:
Errors are signaled by modifying the internal state flags:
eofbit: The get pointer is at the end of the stream buffer's internal input array when the function is called, meaning that there are no positions to be read in the internal buffer (which may or not be the end of the input sequence). This happens when rdbuf()->in_avail() would return -1 before the first character is extracted.
failbit: The stream was at the end of the source of characters before the function was called.
badbit: An error other than the above happened.
The standard says almost the same thing:
[C++11: 27.7.2.3]:
streamsize readsome(char_type* s, streamsize n);
32. Effects: Behaves as an unformatted input function (as described in 27.7.2.3, paragraph 1). After constructing a sentry object, if !good() calls setstate(failbit) which may throw an exception, and return. Otherwise extracts characters and stores them into successive locations of an array whose first element is designated by s.
- If rdbuf()->in_avail() == -1, calls setstate(eofbit) (which may throw ios_base::failure (27.5.5.4)), and extracts no characters;
- If rdbuf()->in_avail() == 0, extracts no characters
- If rdbuf()->in_avail() > 0, extracts min(rdbuf()->in_avail(),n)).
33. Returns: The number of characters extracted.
That the in_avail() == 0 condition is a no-op implies that ifstream::readsome itself is a no-op if the stream buffer is empty, but the in_avail() == -1 condition implies that it will set eofbit when some other operation has led to in_avail() == -1.
This seems like an inconsistency, even despite the "some" nature of readsome.
So what are the semantics of readsome and eof? Have I interpreted them correctly? Are they an example of poor design in the streams library?
(Stolen from the [IMO] invalid libstdc++ bug 52169.)
I think this is a customization point, not really used by the default stream implementations.
in_avail() returns the number of chars it can see in the internal buffer, if any. Otherwise it calls showmanyc() to try to detect if chars are known to be available elsewhere, so a buffer fill request is guaranteed to succeed.
In turn, showmanyc() will return the number of chars it knows about, if any, or -1 if it knows that a read will fail, or 0 if it doesn't have a clue.
The default implementation (basic_streambuf) always returns 0, so that is what you get unless you have a stream with some other streambuf overriding showmanyc.
Your loop is essentially read-as-many-chars-as-you-know-is-safe, and it gets stuck when that is zero (meaning "not sure").
I don't think that readsome() is meant for what you're trying to do (read from a file on disk)... from cplusplus.com:
The function is intended to be used to read binary data from certain types of asynchronic sources that may wait for more characters, since it stops reading when the local buffer exhausts, avoiding potential unexpected delays.
So it sounds like readsome() is intended for streams from a network socket or something like that, and you probably want to just use read().