I have a long-running C program which opens a file at the beginning, writes out "interesting" stuff during execution, and closes the file just before it finishes. The code, compiled with gcc -o test test.c
(gcc version 5.3.1), looks as follows:
// contents of test.c
#include <stdio.h>

FILE *filept;

int main() {
    filept = fopen("test.txt", "w");
    unsigned long i;
    for (i = 0; i < 1152921504606846976; ++i) {
        if (i == 0) { // This case is interesting!
            fprintf(filept, "Hello world\n");
        }
    }
    fclose(filept);
    return 0;
}
The problem is that, since this is a scientific computation (think of searching for primes, or whatever your favourite hard-to-crack problem is), it could really run for a very long time. I have determined that I am not patient enough, so I would like to abort the current computation, but I would like to do this intelligently, by somehow forcing the program, through external means, to flush out all the data that is currently sitting in a buffer or in the OS disk cache, wherever it may be.
Here is what I have tried (for this bogus program above, and of course not for the real deal which is currently still running):
kill -6 <PID>
(and also kill -3 <PID>
), as suggested by @BartekBanachewicz, but after either of these approaches the file test.txt
created at the very beginning of the program remains empty. This means that the contents of the fprintf()
calls were left in some intermediate buffer during the computation, waiting for some OS/hardware/software flush signal; since no such signal arrived, the contents disappeared. This also means that the comment made by @EJP,
Your question is based on a fallacy. 'Stuff that is in the OS buffer/disk cache' won't be lost.
does not seem to apply here. Experience shows that stuff indeed gets lost.
I am using Ubuntu 16.04, and I am willing to attach a debugger to this process if that is possible and if it is safe to retrieve the data this way. Since I have never done such a thing before, I would appreciate a detailed answer on how to get the contents flushed to disk safely and surely. I am open to other methods as well. There is no room for error here, as I am not going to rerun the program again.
Note: sure, I could have opened and closed the file inside the if
branch, but that is extremely inefficient once you have many things to write. Recompiling the program is not possible, as it is still in the middle of the computation.
Note 2: the original question asked the same thing in a slightly more abstract way, related to C++, and was tagged as such (which is why people in the comments suggested std::flush()
, which wouldn't help even if this were a C++ question). Well, I guess I made a major edit then.
Somewhat related: Will data written via write() be flushed to disk if a process is killed?
Can I just add some clarity? Obviously months have passed, and I imagine your program isn't running any more ... but there's some confusion here about buffering which still isn't clear.
As soon as you use the stdio library and FILE *
, you will by default have a fairly small buffer (implementation dependent, but typically a few KB) inside your program, which accumulates what you write and flushes it to the OS when it is full (or on file close). When you kill your process, it is this buffer that gets lost.
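The loss described above is easy to reproduce. The following is a minimal sketch (assuming Linux/glibc; the filenames are invented for illustration): a child process writes a line and then dies abruptly via _exit(), which, much like a kill, skips the stdio flush that a normal exit() or fclose() would perform.

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write one line, then terminate the way a killed process does:
 * _exit() skips the stdio flush that a normal exit() performs. */
static void write_and_die(const char *path, int do_flush) {
    pid_t pid = fork();
    if (pid == 0) {                   /* child */
        FILE *fp = fopen(path, "w");
        fprintf(fp, "Hello world\n"); /* sits in the stdio buffer */
        if (do_flush)
            fflush(fp);               /* push it to the OS now */
        _exit(0);                     /* abrupt death: no stdio flush */
    }
    waitpid(pid, NULL, 0);
}

/* Helper: size of a file in bytes, or -1 if it cannot be opened. */
static long file_size(const char *path) {
    FILE *fp = fopen(path, "r");
    if (!fp) return -1;
    fseek(fp, 0, SEEK_END);
    long n = ftell(fp);
    fclose(fp);
    return n;
}
```

Without the fflush(), the file is created but stays empty; with it, the line survives the abrupt termination, mirroring exactly what happened to test.txt above.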
If the data has been flushed to the OS, then it is kept in a kernel file buffer until the OS decides to persist it to disk (usually fairly soon), or someone runs the sync
command. If you kill the power on your computer, then this buffer is lost as well. You probably don't care about this scenario, because you probably aren't planning to yank the power! But this is what @EJP was talking about (re "'Stuff that is in the OS buffer/disk cache' won't be lost"): your problem is the stdio cache, not the OS.
In an ideal world, you'd write your app so that it calls fflush() (or std::flush()
) at key points. In your example, you'd say:
if (i == 0) { // This case is interesting!
    fprintf(filept, "Hello world\n");
    fflush(filept);
}
which would cause the stdio buffer to flush to the OS. I imagine your real writer is more complex, and in that situation I would try to make the fflush() happen "often but not too often". Too rarely, and you lose data when you kill the process; too often, and you lose the performance benefit of buffering if you are writing a lot.
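One common way to strike that balance is to flush every N records rather than on every write. The sketch below is illustrative only (the wrapper name and interval are mine, not from the original program): a killed process then loses at most FLUSH_INTERVAL lines.

```c
#include <stdio.h>

/* Flush after every FLUSH_INTERVAL records, so an abrupt kill loses
 * at most that many lines while keeping most of the buffering win.
 * The interval is a tunable illustration, not a recommended value. */
#define FLUSH_INTERVAL 100

static unsigned long records_written = 0;

static void log_record(FILE *fp, unsigned long i) {
    fprintf(fp, "result %lu\n", i);
    if (++records_written % FLUSH_INTERVAL == 0)
        fflush(fp);
}
```

A variant of the same idea is to flush on a timer (say, once a second) instead of on a count, which bounds the loss in wall-clock time rather than in records.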
In your described situation, where the program is already running and can't be stopped and rewritten, your only hope, as you say, is to stop it in a debugger. The details of what you need to do depend on the implementation of the standard library, but you can usually look inside the FILE *filept
object and start following pointers, messy though it is. @ivan_pozdeev's comment about executing std::flush()
or fflush()
within the debugger is helpful.
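Concretely, on Ubuntu with gdb installed, a session along these lines should work (a sketch, assuming glibc; <PID> is the running process's id, and filept must be visible as a global symbol, which requires the binary to have been compiled with symbols or the variable to be in the symbol table):

```
$ sudo gdb -p <PID>         # attach; the process is paused while gdb holds it
(gdb) call fflush(filept)   # flush the program's own FILE * to the OS
(gdb) call fflush(0)        # alternative: fflush(NULL) flushes ALL open output streams
(gdb) detach                # let the program resume running
(gdb) quit
```

fflush(NULL) flushing every open output stream is standard C behaviour, so the second form is useful when the FILE pointer's name is unknown or the binary was stripped. Attaching a debugger pauses the process but does not otherwise disturb it, so this is generally safe; still, given that there is no room for error, it would be prudent to rehearse the exact commands on the bogus test program first.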