My application uses lseek() to seek to the desired position before writing data. The file is opened successfully with open(), and the application was able to call lseek() and write() many times.
At some point, for some users and not easily reproducible, lseek() returns -1 with an errno of 9. The file is not closed before this, and the file handle (an int) isn't reset.
After this, another file is created; open() succeeds again, and lseek() and write() work again.
To make it even worse, this user tried the complete sequence again and all was well.
So my question is, can the OS close the file handle for me for some reason? What could cause this? A file indexer or file scanner of some sort?
What is the best way to solve this; is this pseudo code the best solution? (never mind the code layout, will create functions for it)
int fd=open(...);
if (fd>-1) {
    long result = lseek(fd,....);
    if (result == -1 && errno==9) {
        close(fd..); //make sure we try to close nicely
        fd=open(...);
        result = lseek(fd,....);
    }
}
Does anybody have experience with something similar?
Summary: file seek and write works okay for a given fd and suddenly gives back errno=9 without a reason.
So my question is, can the OS close the file handle for me for some reason? What could cause this? A file indexer or file scanner of some sort?
No, this will not happen.
What is the best way to solve this; is this pseudo code the best solution? (never mind the code layout, will create functions for it)
No, the best way is to find the bug and fix it.
Does anybody have experience with something similar?
I've seen fds getting messed up many times, resulting in EBADF in some cases and blowing up spectacularly in others. One culprit has been:
if(fd = foo[i].fd)
when they meant if(fd == foo[i].fd).
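To make the effect of that typo concrete, here is a minimal sketch. The struct and both function names are hypothetical, modelled only on the foo[i].fd expression above; this is not the asker's code. The buggy version overwrites the caller's descriptor with whatever is in the table, so later calls run against the wrong fd (or an invalid one).

```c
#include <stddef.h>

/* Hypothetical table entry, modelled on foo[i].fd from the snippet above. */
struct entry { int fd; };

/* Buggy lookup: '=' assigns, so *fd is overwritten by the table value.
 * It "matches" the first entry whose fd is non-zero, whatever *fd was.
 * The extra parentheses only silence the compiler warning. */
int find_buggy(int *fd, const struct entry *tab, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if ((*fd = tab[i].fd))
            return (int)i;
    return -1;
}

/* Intended lookup: '==' compares and leaves *fd untouched. */
int find_fixed(const int *fd, const struct entry *tab, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (*fd == tab[i].fd)
            return (int)i;
    return -1;
}
```

Without the extra parentheses, GCC and Clang report this pattern under -Wall (the -Wparentheses warning), so building with warnings enabled is a cheap first check.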
If you can find a way to reproduce this problem, run your program under 'strace', so you can see what's going on.
The OS will not close file handles randomly (I am assuming a Unix-like system). If your file handle is closed, then there is something wrong with your code, most probably elsewhere. Thanks to the C language and the Unix API, this can be really anywhere in the code, and may be due to, e.g., a slight buffer overflow in some piece of code that looks completely unrelated.
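For reference, errno 9 is EBADF ("bad file descriptor") on Linux, and it is easy to produce from within the program itself. The sketch below uses a hypothetical helper name and a caller-supplied scratch path. It closes a descriptor and then calls lseek() on the stale value, which is the kind of situation a stray close() or an overwritten fd variable elsewhere in the code would create.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno that lseek() reports when it is
 * called on a descriptor this program has already closed, 0 if lseek()
 * unexpectedly succeeds, or -1 if the scratch file cannot be opened. */
int errno_after_stray_close(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    close(fd);                          /* the "stray" close */

    errno = 0;
    off_t pos = lseek(fd, 0, SEEK_SET); /* fd is now a stale number */
    int saved = (pos == (off_t)-1) ? errno : 0;

    unlink(path);                       /* remove the scratch file */
    return saved;
}
```

In a multi-threaded program the stale number may be reused by another open() before the lseek() runs, in which case the call succeeds against the wrong file instead of failing. That may be why the symptom is intermittent, though this is only a guess about the asker's program.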
Your pseudo-code is the worst solution, since it will give you the impression of having fixed the problem, while the bug still lurks.
I suggest that you add debug prints (i.e. printf() calls) wherever you open and close a file or socket. Also, try Valgrind.
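One way to add those debug prints without touching every call site by hand is a pair of thin wrappers. The names traced_open, traced_close, TRACED_OPEN and TRACED_CLOSE are hypothetical, and this is a sketch rather than a finished logging layer. Each wrapper logs the call site and result to stderr and preserves errno for the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: open() plus a log line naming the call site. */
int traced_open(const char *path, int flags, mode_t mode,
                const char *file, int line)
{
    int fd = open(path, flags, mode);
    int saved = errno;                  /* fprintf may change errno */
    fprintf(stderr, "%s:%d open(\"%s\") = %d%s%s\n", file, line, path, fd,
            fd == -1 ? " errno: " : "", fd == -1 ? strerror(saved) : "");
    errno = saved;
    return fd;
}

/* Hypothetical wrapper: close() plus a log line naming the call site. */
int traced_close(int fd, const char *file, int line)
{
    int rc = close(fd);
    int saved = errno;
    fprintf(stderr, "%s:%d close(%d) = %d%s%s\n", file, line, fd, rc,
            rc == -1 ? " errno: " : "", rc == -1 ? strerror(saved) : "");
    errno = saved;
    return rc;
}

/* Convenience macros that fill in the call site automatically. */
#define TRACED_OPEN(path, flags, mode) \
    traced_open((path), (flags), (mode), __FILE__, __LINE__)
#define TRACED_CLOSE(fd) traced_close((fd), __FILE__, __LINE__)
```

With every open and close logged, a descriptor that is closed twice, or closed by code that never opened it, shows up in the log as a close() line with no matching open() line before it.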
(Just yesterday I had a spooky off-by-one buffer overflow, which damaged the least significant byte of a temporary slot generated by the compiler to save a CPU register; the indirect effect was that a structure in another function appeared to be shifted by a few bytes. It took me quite some time to understand what was going on, including some thorough reading of MIPS assembly code.)