
Why does the C++ standard handle file seeking the way it does?

C++ uses the streamoff type to represent an offset within a (file) stream. It is defined as follows in [stream.types]:

using streamoff = implementation-defined ;

The type streamoff is a synonym for one of the signed basic integral types of sufficient size to represent the maximum possible file size for the operating system. 287)

287) Typically long long.

This makes sense because it allows for seeking within large files (as opposed to using long, which may be only 32 bits wide).
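To see which type a given implementation actually picked, the types can be inspected directly. This is a minimal sketch, not something taken from the standard; the helper name streamoff_is_wider_than_long is made up for illustration.

```cpp
#include <ios>          // std::streamoff
#include <limits>
#include <type_traits>

// The quoted wording asks for a signed integral type; these checks
// simply confirm that for the implementation being compiled against.
static_assert(std::is_integral<std::streamoff>::value,
              "streamoff is expected to be an integral type");
static_assert(std::is_signed<std::streamoff>::value,
              "streamoff is expected to be signed");

// Hypothetical helper: true when streamoff can represent offsets
// that do not fit in a long on this platform.
constexpr bool streamoff_is_wider_than_long() {
    return std::numeric_limits<std::streamoff>::max() >
           std::numeric_limits<long>::max();
}
```

On a platform where long is 64 bits the helper would be expected to return false; where long is 32 bits and streamoff is long long, it would return true.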

[filebuf.virtuals] defines basic_filebuf's function to seek within a file as follows:

pos_type seekoff(off_type off, ios_base::seekdir way, ios_base::openmode which = ios_base::in | ios_base::out) override;

off_type is equivalent to streamoff, see [iostreams.limits.pos]. However, the standard then goes on to explain the function's effects. I'm irritated by the very last sentence, which requires a call to fseek:

Effects: Let width denote a_codecvt.encoding(). If is_open() == false, or off != 0 && width <= 0, then the positioning operation fails. Otherwise, if way != basic_ios::cur or off != 0, and if the last operation was output, then update the output sequence and write any unshift sequence. Next, seek to the new position: if width > 0, call fseek(file, width * off, whence), otherwise call fseek(file, 0, whence).

fseek takes its offset as a long. If off_type and streamoff are defined as long long (as the standard suggests), calling fseek(file, width * off, whence) could narrow the offset to long, which can lead to hard-to-diagnose bugs. This calls into question the whole rationale for introducing the streamoff type in the first place.
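The concern can be made concrete with a small range check. This is only a sketch under stated assumptions: long32 is a stand-in for a 32-bit long (on many 64-bit systems long is actually 64 bits), and fits_in_long32 is a hypothetical helper, not part of any library.

```cpp
#include <cstdint>
#include <ios>      // std::streamoff
#include <limits>

// Assumption for illustration: model a platform whose long is 32 bits.
using long32 = std::int32_t;

// Hypothetical helper: true when the offset can be passed through a
// 32-bit long parameter without losing information.
constexpr bool fits_in_long32(std::streamoff off) {
    return off >= std::numeric_limits<long32>::min() &&
           off <= std::numeric_limits<long32>::max();
}
```

With a 64-bit streamoff, an offset of 1024 passes the check, while an offset of 5 GiB does not; handing the latter to a function that takes a 32-bit long would seek to a different position than the one requested.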

Is this intentional or a defect in the standard?

jceed2 asked Dec 13 '19 19:12



1 Answer

I think that the conclusion that you're drawing from this, that there is a mismatch between C++ streams and fseek that will lead to runtime bugs, is incorrect. The situation seems to be:

  1. On systems where long is 64 bits, streamoff is defined as long, and the seekoff function invokes fseek.

  2. On systems where long is 32 bits but the OS supports 64-bit file offsets, streamoff is defined as long long and seekoff invokes a function called either fseeko or fseeko64 that accepts a 64-bit offset.

Here's a snippet from the definition of seekoff on my Linux system:

#ifdef _GLIBCXX_USE_LFS
    if (!fseeko64(_M_file, __off, __whence))
      __ret = std::streampos(ftello64(_M_file));
#else
    if (!fseek(_M_file, __off, __whence))
      __ret = std::streampos(std::ftell(_M_file));
#endif

LFS stands for Large File Support.
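The same idea can be sketched outside the library. The following is a rough, POSIX-only illustration, not the library's code: seek_full_range is a hypothetical helper, and it assumes fseeko and off_t are available, which is not guaranteed by ISO C or C++.

```cpp
#include <cstdio>
#include <ios>          // std::streamoff
#include <limits>
#include <stdio.h>      // fseeko (POSIX extension, assumed available)
#include <sys/types.h>  // off_t

// Hypothetical helper: seek using off_t instead of long, and refuse
// offsets that off_t cannot represent rather than narrowing them.
inline bool seek_full_range(std::FILE* file, std::streamoff off, int whence) {
    if (off < std::numeric_limits<off_t>::min() ||
        off > std::numeric_limits<off_t>::max()) {
        return false;  // would not survive the conversion
    }
    return fseeko(file, static_cast<off_t>(off), whence) == 0;
}
```

Whether off_t is 64 bits wide depends on the platform and build settings, so the range check is kept even though it may never trigger on a 64-bit system.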

Conclusion: While the standard suggests a definition for streamoff that ostensibly conflicts with the requirement that seekoff invoke fseek, library designers understand that they must call the variant of fseek that accepts the full range of offsets that the OS supports.

Willis Blackburn answered Oct 12 '22 18:10