
Is there a guaranteed and safe way to truncate a file from ANSI C FILE pointer?

Tags:

c

file-io

I know ANSI C defines fopen, fwrite, fread, fclose to modify a file's content. However, when it comes to truncating a file, we have to turn to OS-specific functions, e.g. truncate() on Linux or _chsize_s() on Windows. But before we can call those OS-specific functions, we have to obtain the file handle from the FILE pointer by calling fileno, which is also a non-ANSI-C function.

My question is: is it reliable to continue using FILE* after truncating the file? I mean, the ANSI C FILE layer has its own buffer and does not know that the file has been truncated underneath it. If the buffered bytes lie beyond the truncation point, will the buffered content be flushed to the file on fclose()?

If there is no guarantee, what is the best practice for combining the file I/O functions with a truncate operation when writing a Windows-Linux portable program?

Similar question: when querying the file size from the file handle returned by fileno, is it the accurate size when I later call fclose() -- without any further fwrite()?

[EDIT 2012-12-11]

Following Joshua's suggestion, I conclude that the current possible best practice is: set the stream to unbuffered mode by calling setbuf(stream, NULL); and then truncate() or _chsize_s() can work peacefully with the stream.

Anyhow, no official document seems to explicitly confirm this behavior, whether Microsoft CRT or GNU glibc.


asked Dec 07 '12 01:12

Jimm Chen

2 Answers

The POSIX way....

ftruncate() is what you're looking for, and it's been in POSIX base specifications since 2001, so it should be in every modern POSIX-compatible system by now.

Note that ftruncate() operates on a POSIX file descriptor (despite its potentially misleading name), not a STDIO stream FILE handle. Note also that mixing operations on the STDIO stream and on the underlying OS calls which operate on the file descriptor for the open stream can confuse the internal runtime state of the STDIO library.

So, to use ftruncate() safely with STDIO it may be necessary to first flush any STDIO buffers (with fflush()) if your program may have already written to the stream in question. This will avoid STDIO trying to flush the otherwise unwritten buffer to the file after the truncation has been done.

You can then use fileno() on the STDIO stream's FILE handle to find the underlying file descriptor for the open STDIO stream, and you would then use that file descriptor with ftruncate(). You might consider putting the call to fileno() right in the parameter list for the ftruncate() call so that you don't keep the file descriptor around and accidentally use it in yet other ways which might further confuse the internal state of STDIO. Perhaps like this (say to truncate a file to the current STDIO stream offset):

/*
 * NOTE: fflush() is not needed here if there have been no calls to fseek() since
 * the last fwrite(), assuming it extended the length of the stream --
 * ftello() will account for any unwritten buffers
 */
if (ftruncate(fileno(stdout), ftello(stdout)) == -1) {
        fprintf(stderr, "%s: ftruncate(stdout) failed: %s\n", argv[0], strerror(errno));
        exit(1);
}
/* fseek() is not necessary here since we truncated at the current offset */

Note also that the POSIX definition of ftruncate() says "The value of the seek pointer shall not be modified by a call to ftruncate()", so this means you may also need to use fseek() to set the STDIO layer (and thus indirectly the file descriptor) either to the new end of the file, or perhaps back to the beginning of the file, or somewhere still within the boundaries of the file, as desired. (Note that the fseek() should not be necessary if the truncation point is found using ftello().)

You should not have to make the STDIO stream unbuffered if you follow the procedure above, though of course doing so could be an alternative to using fflush() (but not fseek()).
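The flush, truncate, reposition sequence described above could be wrapped up roughly as follows. This is only a minimal sketch for a POSIX system: the helper name truncate_stream() is made up for illustration, and it assumes the stream was opened for writing on a regular file.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fileno(), ftruncate(), fseeko() */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): truncate the file behind
 * an open, writable STDIO stream to `length` bytes.
 * Returns 0 on success, -1 on failure with errno set.
 */
int truncate_stream(FILE *fp, off_t length)
{
    /* 1. Push any buffered output down to the file descriptor first. */
    if (fflush(fp) == EOF)
        return -1;

    /* 2. Truncate through the descriptor without keeping it around. */
    if (ftruncate(fileno(fp), length) == -1)
        return -1;

    /* 3. ftruncate() does not move the offset, so reposition the
     *    stream at the new end of the file. */
    return fseeko(fp, length, SEEK_SET);
}
```

Repositioning at the new end is just one choice; a caller that wants to rewrite the file from the start could call rewind() afterwards instead.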

Without POSIX....

If you need to stick to strict ISO Standard C, say C99, then you have no portable way to truncate a file to a given length other than zero (0) length. The latest draft of C11 that I have says this in Section 7.21.3 (paragraph 2):

Binary files are not truncated, except as defined in 7.21.5.3. Whether a write on a text stream causes the associated file to be truncated beyond that point is implementation-defined.

(and 7.21.5.3 describes the flags to fopen() which allow a file to be truncated to a length of zero)
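For completeness, here is a small sketch of the one truncation that strict ISO C does allow, namely emptying a file by reopening it in a "w" mode. The helper name truncate_to_zero() is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: empty an existing file using only ISO C.
 * Opening with mode "wb" truncates the file to zero length, or creates
 * it if it does not exist.
 * Returns 0 on success, -1 on failure.
 */
int truncate_to_zero(const char *path)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    return fclose(fp) == 0 ? 0 : -1;
}
```

Note that this works on a path rather than on an already open stream, so it does not help when the stream must stay open across the truncation.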

The caveat about text files is there because on silly systems that distinguish text files from binary files (as opposed to plain POSIX-style content-agnostic files), it is often possible to write a value to the file which will be stored at the position written and which will be treated as an EOF indicator when the file is next read.

Other types of systems may have different underlying file I/O interfaces that are not compatible with POSIX while still providing a compatible ISO C STDIO library. In theory, if such a system offers something similar to fileno() and ftruncate(), then a similar procedure could be used with them as well, provided that one took the same care to avoid confusing the internal runtime state of the STDIO library.
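For the Windows and Linux case raised in the question, the same procedure might be sketched with a conditional wrapper like the one below. The name portable_truncate() is invented for this example, and it assumes the Microsoft CRT's _fileno() and _chsize_s() on Windows and POSIX fileno() and ftruncate() everywhere else. Neither runtime documents this combination explicitly, so treat it as an illustration rather than a guarantee.

```c
#define _POSIX_C_SOURCE 200809L  /* harmless on Windows; exposes fileno() elsewhere */
#include <stdio.h>

#ifdef _WIN32
#include <io.h>          /* _chsize_s() */
#else
#include <sys/types.h>   /* off_t */
#include <unistd.h>      /* ftruncate() */
#endif

/*
 * Hypothetical wrapper: truncate an open, writable stream to `length`
 * bytes on either the Microsoft CRT or a POSIX system.
 * Returns 0 on success, -1 on failure.
 */
int portable_truncate(FILE *fp, long long length)
{
    /* Flush first so STDIO holds no unwritten data past the cut. */
    if (fflush(fp) == EOF)
        return -1;
#ifdef _WIN32
    if (_chsize_s(_fileno(fp), length) != 0)
        return -1;
#else
    if (ftruncate(fileno(fp), (off_t)length) == -1)
        return -1;
#endif
    /* Reposition through STDIO so the stream and the OS agree again. */
    return fseek(fp, 0L, SEEK_END) == 0 ? 0 : -1;
}
```

The final fseek() moves the stream to the new end of the file; seek elsewhere afterwards if the next write belongs at a different position.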

With regard to querying file size....

You also asked whether the file size found by querying the file descriptor returned by fileno() would be an accurate representation of the file size after a successful call to fclose(), even without any further calls to fwrite().

The answer is: Don't do that!

As I mentioned above, the POSIX file descriptor for a file opened as a STDIO stream must be used very carefully if you don't want to confuse the internal runtime state of the STDIO library. We can add here that it is important not to confuse yourself with it either.

The most correct way to find the current size of a file opened as a STDIO stream is to seek to the end of it and then ask where the stream pointer is by using only STDIO functions.
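A minimal sketch of that STDIO-only size query follows. The helper name stream_size() is made up, and two assumptions apply: the stream is a binary stream on a system where seeking to SEEK_END is meaningful (ISO C does not require that for binary streams), and the size fits in a long.

```c
#include <stdio.h>

/*
 * Hypothetical helper: report the size of an open stream using only
 * STDIO calls. The stream position is restored before returning.
 * Returns the size in bytes, or -1L on failure.
 */
long stream_size(FILE *fp)
{
    long saved = ftell(fp);          /* remember where we were */
    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    long size = ftell(fp);           /* offset at end == size */
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1L;
    return size;
}
```

On POSIX systems with large files, fseeko() and ftello() with off_t would be the safer pair.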

answered Sep 21 '22 15:09

Greg A. Woods

Isn't an unbuffered write of zero bytes supposed to truncate the file at that point?

See this question for how to set unbuffered: Unbuffered I/O in ANSI C

answered Sep 20 '22 15:09

Joshua
