
What are the reasons to check for error on close()?

Tags: c, linux, posix

Note: Please read to the end before marking this as duplicate. While it's similar, the scope of what I'm looking for in an answer extends beyond what the previous question was asking for.

Widespread practice, which I tend to agree with, is to treat close purely as a resource-deallocation function for file descriptors rather than as a potential IO operation with meaningful failure cases. And indeed, prior to the resolution of issue 529, POSIX left the state of the file descriptor (i.e. whether it was still allocated or not) unspecified after errors, making it impossible to respond portably to errors in any meaningful way.

However, a lot of GNU software goes to great lengths to check for errors from close, and the Linux man page for close calls failure to do so "a common but nevertheless serious programming error". It cites NFS and quotas as circumstances under which close might produce an error, but does not give details.

What are the situations under which close might fail, on real-world systems, and are they relevant today? I'm particularly interested in knowing whether there are any modern systems where close fails for any non-NFS, non-device-node-specific reasons, and as for NFS or device-related failures, under what conditions (e.g. configurations) they might be seen.

R.. GitHub STOP HELPING ICE asked Jun 29 '14 15:06


2 Answers

Once upon a time (24 March 2007), Eric Sosman had the following tale to share in the comp.lang.c newsgroup:

(Let me begin by confessing to a little white lie: It wasn't fclose() whose failure went undetected, but the POSIX close() function; this part of the application used POSIX I/O. The lie is harmless, though, because the C I/O facilities would have failed in exactly the same way, and an undetected failure would have had the same consequences. I'll describe what happened in terms of C's I/O to avoid dwelling on POSIX too much.)

The situation was very much as Richard Tobin described. The application was a document management system that loaded a document file into memory, applied the user's edits to the in-memory copy, and then wrote everything to a new file when told to save the edits. It also maintained a one-level "old version" backup for safety's sake: the Save operation wrote to a temp file, and then if that was successful it deleted the old backup, renamed the old document file to the backup name, and renamed the temp file to the document. bak -> trash, doc -> bak, tmp -> doc.

The write-to-temp-file step checked almost everything. The fopen(), obviously, but also all the fwrite()s and even a final fflush() were checked for error indications -- but the fclose() was not. And on one system it happened that the last few disk blocks weren't actually allocated until fclose() -- the I/O system sat atop VMS' lower-level file access machinery, and a little bit of asynchrony was inherent in the arrangement.

The customer's system had disk quotas enabled, and the victim was right up close to his limit. He opened a document, edited for a while, saved his work thus far, and exceeded his quota -- which went undetected because the error didn't appear until the unchecked fclose(). Thinking that the save succeeded, the application discarded the old backup, renamed the original document to become the backup, and renamed the truncated temp file to be the new document. The user worked a little longer and saved again -- same thing, except you'll note that this time the only surviving complete file got deleted, and both the backup and the master document file are truncated. Result: the whole document file became trash, not just the latest session of work but everything that had gone before.

As Murphy would have it, the victim was the boss of the department that had purchased several hundred licenses for our software, and I got the privilege of flying to St. Louis to be thrown to the lions.

[...]

In this case, the failure of fclose() would (if detected) have stopped the delete-and-rename sequence. The user would have been told "Hey, there was a problem saving the document; do something about it and try again. Meanwhile, nothing has changed on disk." Even if he'd been unable to save his latest batch of work, he would at least not have lost everything that went before.
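The save sequence from the story can be sketched in C as follows. This is only an illustration under stated assumptions, not the original application's code: the function names `write_temp_file` and `save_document` and their parameters are hypothetical, and the `ENOENT` check assumes POSIX `rename` behaviour. The point it shows is that the delete-and-rename steps run only after `fclose()` has reported success.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: write len bytes to tmp_path. Returns 0 only if every
   step, including fclose(), reported success. */
int write_temp_file(const char *tmp_path, const void *data, size_t len)
{
    assert(tmp_path != NULL);

    FILE *fp = fopen(tmp_path, "wb");
    if (fp == NULL)
        return -1;

    int ok = (fwrite(data, 1, len, fp) == len);
    if (ok && fflush(fp) == EOF)
        ok = 0;
    /* Close exactly once, whether or not the writes worked, and treat a
       failed fclose() as a failed save. */
    if (fclose(fp) == EOF)
        ok = 0;

    if (!ok) {
        remove(tmp_path);   /* best effort: discard the incomplete file */
        return -1;
    }
    return 0;
}

/* Hypothetical save routine: bak -> trash, doc -> bak, tmp -> doc. */
int save_document(const char *doc, const char *bak, const char *tmp,
                  const void *data, size_t len)
{
    if (write_temp_file(tmp, data, len) != 0)
        return -1;          /* doc and bak are untouched */

    remove(bak);            /* a missing backup is not an error */
    if (rename(doc, bak) != 0 && errno != ENOENT)
        return -1;          /* ENOENT: first save, no document yet */
    return rename(tmp, doc) == 0 ? 0 : -1;
}
```

A caller would report a nonzero return from `save_document` to the user and leave the existing document and backup alone.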

Nisse Engström answered Oct 05 '22 17:10



Consider the inverse of your question: "Under what situations can we guarantee that close will succeed?" The answer is:

  • when you call it correctly, and
  • when you know that the file system the file is on does not return errors from close in this OS and kernel version

If you are convinced that your program doesn't have any logic errors and you have complete control over the kernel and file system, then you don't need to check the return value of close.

Otherwise, you have to ask yourself how much you care about diagnosing problems with close. I think there is value in checking and logging the error for diagnostic purposes:

  • If a coder makes a logic error and passes an invalid fd to close, then you'll be able to quickly track it down. This may help to catch a bug early before it causes problems.
  • If a user runs the program in an environment where close does return an error when (for example) data was not flushed, then you'll be able to quickly diagnose why the data got corrupted. It's an easy red flag because you know the error should not occur.
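
A minimal sketch of the "check and log" approach described above, assuming a POSIX system. The wrapper name `checked_close` and its `what` label are hypothetical, introduced only for illustration. It does not retry on failure, because the Linux man page notes that the kernel releases the descriptor even when close reports an error, so a second call could close an unrelated descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: close fd once, log any failure to stderr, and return
   0 on success or -1 on failure with errno preserved for the caller. */
int checked_close(int fd, const char *what)
{
    assert(what != NULL);

    if (close(fd) == 0)
        return 0;

    int saved = errno;      /* fprintf may overwrite errno */
    fprintf(stderr, "close(%s, fd=%d) failed: %s\n",
            what, fd, strerror(saved));
    errno = saved;
    return -1;              /* no retry: the fd may already be released */
}
```

Passing an invalid descriptor, as in the first bullet, makes the wrapper log an `EBADF` message instead of failing silently.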
Anton answered Oct 05 '22 16:10
