
Is there a guaranteed and safe way to truncate a file from an ANSI C FILE pointer?

Tags: c, file-io

I know ANSI C defines fopen, fwrite, fread, and fclose to modify a file's content. However, when it comes to truncating a file, we have to turn to OS-specific functions, e.g. truncate() on Linux or _chsize_s() on Windows. But before we can call those OS-specific functions, we have to obtain the file handle from the FILE pointer by calling fileno, which is also a non-ANSI-C function.

My question is: Is it reliable to continue using the FILE* after truncating the file? I mean, the ANSI C FILE layer has its own buffer and does not know the file has been truncated beneath it. If the buffered bytes lie beyond the truncation point, will the buffered content be flushed to the file on fclose()?

If there is no guarantee, what is the best practice for combining the file I/O functions with a truncate operation when writing a Windows-Linux portable program?

Similar question: When querying the file size from the file handle returned by fileno, is the result the accurate size the file will have when I later call fclose(), without any further fwrite()?

[EDIT 2012-12-11]

Following Joshua's suggestion, I conclude that the best practice currently available is probably this: set the stream to unbuffered mode by calling setbuf(stream, NULL); after that, truncate() or _chsize_s() can work peacefully with the stream.

However, no official documentation seems to explicitly confirm this behavior, whether for the Microsoft CRT or GNU glibc.
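A minimal sketch of the unbuffered approach on a POSIX system, not a confirmed guarantee. The helper names make_unbuffered and truncate_unbuffered are hypothetical, and setvbuf() with _IONBF is used as the equivalent of setbuf(stream, NULL). It assumes the stream was just opened for writing or updating and that nothing else has been done to it yet, since the buffering mode has to be set before any other operation on the stream.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno(), ftruncate(), fseeko() */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch a freshly opened stream to unbuffered mode.
 * Equivalent to setbuf(fp, NULL), but reports failure. Returns 0 on success.
 */
int make_unbuffered(FILE *fp)
{
    return setvbuf(fp, NULL, _IONBF, 0);
}

/*
 * Hypothetical helper: truncate an unbuffered stream to `length` bytes.
 * No fflush() is issued, on the assumption that an unbuffered stream
 * holds no pending output. The stream is then moved to the new end of
 * the file. Returns 0 on success, -1 on failure.
 */
int truncate_unbuffered(FILE *fp, off_t length)
{
    if (ftruncate(fileno(fp), length) == -1)
        return -1;
    return fseeko(fp, 0, SEEK_END);
}
```

On Windows the ftruncate() call would have to be swapped for _chsize_s(), which this sketch does not attempt.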

Jimm Chen asked Dec 07 '12 01:12



2 Answers

The POSIX way....

ftruncate() is what you're looking for, and it's been in POSIX base specifications since 2001, so it should be in every modern POSIX-compatible system by now.

Note that ftruncate() operates on a POSIX file descriptor (despite its potentially misleading name), not a STDIO stream FILE handle. Note also that mixing operations on the STDIO stream and on the underlying OS calls which operate on the file descriptor for the open stream can confuse the internal runtime state of the STDIO library.

So, to use ftruncate() safely with STDIO it may be necessary to first flush any STDIO buffers (with fflush()) if your program may have already written to the stream in question. This will avoid STDIO trying to flush the otherwise unwritten buffer to the file after the truncation has been done.

You can then use fileno() on the STDIO stream's FILE handle to find the underlying file descriptor for the open STDIO stream, and pass that file descriptor to ftruncate(). You might consider putting the call to fileno() right in the parameter list of the ftruncate() call, so that you don't keep the file descriptor around and accidentally use it in yet other ways which might further confuse the internal state of STDIO. Perhaps like this (say, to truncate a file at the current STDIO stream offset):

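/*
 * Fragment meant to run inside main(); needs <stdio.h>, <stdlib.h>,
 * <string.h>, <errno.h>, and <unistd.h>.
 *
 * NOTE: fflush() is not needed here if there have been no calls to fseek() since
 * the last fwrite(), assuming it extended the length of the stream --
 * ftello() will account for any unwritten buffers
 */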
if (ftruncate(fileno(stdout), ftello(stdout)) == -1) {
        fprintf(stderr, "%s: ftruncate(stdout) failed: %s\n", argv[0], strerror(errno));
        exit(1);
}
/* fseek() is not necessary here since we truncated at the current offset */

Note also that the POSIX definition of ftruncate() says "The value of the seek pointer shall not be modified by a call to ftruncate()", so you may also need to use fseek() to set the STDIO layer (and thus indirectly the file descriptor) either to the new end of the file, or perhaps back to the beginning of the file, or somewhere still within the boundaries of the file, as desired. (Note that the fseek() should not be necessary if the truncation point is found using ftello().)
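For truncating at an arbitrary length rather than at the current offset, the whole procedure (flush, truncate through the descriptor, then reposition) could be wrapped up roughly as follows. This is only a sketch: truncate_stream_at is a hypothetical name, and it assumes a POSIX system and a stream opened for writing or updating whose most recent operation was not a read.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno(), ftruncate(), fseeko() */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): truncate the file behind
 * `fp` to `length` bytes and leave the stream positioned at the new end.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int truncate_stream_at(FILE *fp, off_t length)
{
    /* Push pending STDIO output to the descriptor before touching it. */
    if (fflush(fp) == EOF)
        return -1;

    /* Use the descriptor inline so it is not kept around afterwards. */
    if (ftruncate(fileno(fp), length) == -1)
        return -1;

    /* ftruncate() leaves the file offset alone, so reposition via STDIO. */
    return fseeko(fp, 0, SEEK_END);
}
```

A caller that wants to continue from the start of the file instead would follow this with rewind(fp) or another fseek().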

You should not have to make the STDIO stream unbuffered if you follow the procedure above, though of course doing so could be an alternative to using fflush() (but not fseek()).

Without POSIX....

If you need to stick to strict ISO Standard C, say C99, then you have no portable way to truncate a file to a given length other than zero (0) length. The latest draft of C11 that I have says this in Section 7.21.3 (paragraph 2):

Binary files are not truncated, except as defined in 7.21.5.3. Whether a write on a text stream causes the associated file to be truncated beyond that point is implementation-defined.

(and 7.21.5.3 describes the flags to fopen() which allow a file to be truncated to a length of zero)
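As a sketch of the one truncation that strict ISO C does allow, reopening a file in a "w" mode empties it. The helper name truncate_to_zero is hypothetical, and note that this also creates the file if it does not already exist.

```c
#include <stdio.h>

/*
 * Hypothetical helper: empty the named file using only ISO C.
 * Returns 0 on success, -1 if the file could not be opened or closed.
 */
int truncate_to_zero(const char *path)
{
    /* The "w" modes truncate an existing file to zero length. */
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    return fclose(fp) == 0 ? 0 : -1;
}
```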

The caveat about text files is there because on silly systems that distinguish text files from binary files (as opposed to just plain POSIX-style content-agnostic files), it is often possible to write a value to the file which will be stored in the file at the position written and which will be treated as an EOF indicator when the file is next read.

Other types of systems may have different underlying file I/O interfaces that are not compatible with POSIX while still providing a compatible ISO C STDIO library. In theory if such a system offers something similar to fileno() and ftruncate() then a similar procedure could be used with them as well, provided that one took the same care to avoid confusing the internal runtime state of the STDIO library.
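Windows is one such system: its CRT offers _fileno() and _chsize_s(), the function named in the question. A rough sketch of a wrapper that picks the right pair at compile time might look like this. The name portable_truncate is hypothetical, the Windows branch is an assumption based on the documented CRT signatures rather than something confirmed in this thread, and the caller is still expected to reposition the stream afterwards.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 200809L  /* for fileno() and ftruncate() */
#endif
#include <stdio.h>
#ifdef _WIN32
#include <io.h>         /* _chsize_s() */
#else
#include <sys/types.h>  /* off_t */
#include <unistd.h>     /* ftruncate() */
#endif

/*
 * Hypothetical wrapper: flush `fp`, then set the length of its underlying
 * file to `length` bytes. The stream position is NOT adjusted here.
 * Returns 0 on success, -1 on failure.
 */
int portable_truncate(FILE *fp, long long length)
{
    /* Hand any buffered output to the OS before changing the file size. */
    if (fflush(fp) == EOF)
        return -1;
#ifdef _WIN32
    /* _chsize_s() returns 0 on success and an error code otherwise. */
    return _chsize_s(_fileno(fp), length) == 0 ? 0 : -1;
#else
    return ftruncate(fileno(fp), (off_t)length) == -1 ? -1 : 0;
#endif
}
```

After a successful call, an fseek() to the desired position (for example the new end of the file) brings the STDIO layer back in line with the file.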

With regard to querying file size....

You also asked whether the file size found by querying the file descriptor returned by fileno() would be an accurate representation of the file size after a successful call to fclose(), even without any further calls to fwrite().

The answer is: Don't do that!

As I mentioned above, the POSIX file descriptor for a file opened as a STDIO stream must be used very carefully if you don't want to confuse the internal runtime state of the STDIO library. We can add here that it is important not to confuse yourself with it either.

The most correct way to find the current size of a file opened as a STDIO stream is to seek to the end of it and then ask where the stream pointer is by using only STDIO functions.
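A sketch of that approach using only STDIO calls follows. The helper name stream_size is hypothetical. It assumes a seekable stream on a system where the ftell() result is a byte count (true on POSIX; ISO C itself does not promise that SEEK_END is meaningful for every binary stream, or that text-stream positions are byte counts), and a file small enough for its size to fit in a long.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the current size of the file behind `fp`
 * by seeking to its end, then restore the original stream position.
 * Returns -1L on failure.
 */
long stream_size(FILE *fp)
{
    long saved = ftell(fp);   /* remember where the stream was */
    long size;

    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);         /* offset of the end is the size */
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1L;
    return size;
}
```

Because fseek() goes through the STDIO layer, any output still sitting in the stream's buffer is accounted for, which is the point of not asking the descriptor directly.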

Greg A. Woods answered Sep 21 '22 15:09


Isn't an unbuffered write of zero bytes supposed to truncate the file at that point?

See this question for how to set unbuffered: Unbuffered I/O in ANSI C

Joshua answered Sep 20 '22 15:09