Does using fseek and ftell to determine the size of a file have a vulnerability?

Tags: c, file, fseek

I've read posts that show how to use fseek and ftell to determine the size of a file.

FILE *fp;
long file_size;
char *buffer;

fp = fopen("foo.bin", "r");
if (NULL == fp) {
  /* Handle Error */
}

if (fseek(fp, 0, SEEK_END) != 0) {
  /* Handle Error */
}

file_size = ftell(fp);
if (file_size < 0) {
  /* Handle Error: ftell returns -1L on failure */
}

buffer = (char*)malloc(file_size);
if (NULL == buffer) {
  /* Handle Error */
}

I was about to use this technique but then I ran into this link that describes a potential vulnerability.

The link recommends using fstat instead. Can anyone comment on this?

Frank asked May 11 '11 00:05


2 Answers

If your goal is to find the size of a file, you should definitely use fstat() or one of its relatives, such as stat(). It's a much more direct and expressive method: you are literally asking the system to tell you the file's statistics, rather than taking the more roundabout fseek/ftell route.
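A minimal sketch of the fstat() approach on an already-open stream, assuming a POSIX system. The helper name stream_size is made up for illustration and is not a standard function:

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std= modes */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: stores the size of the file behind fp in *size_out.
   Returns 0 on success, -1 on failure or if fp is not a regular file. */
static int stream_size(FILE *fp, off_t *size_out)
{
    struct stat st;
    int fd = fileno(fp);              /* descriptor behind the stdio stream */

    if (fd < 0 || fstat(fd, &st) != 0) {
        return -1;
    }
    if (!S_ISREG(st.st_mode)) {
        return -1;                    /* st_size is only meaningful for regular files */
    }
    *size_out = st.st_size;
    return 0;
}
```

Note that fstat() reports what is on disk, so data still sitting in the stdio buffer should be flushed with fflush() before asking.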

A bonus tip: if you only want to know whether the file is available, use access() rather than opening the file or even calling stat() on it. This is an even simpler operation that many programmers aren't aware of.
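For illustration, a check with access() might look like the sketch below (POSIX only; is_readable is a hypothetical name). Keep in mind that access() tests permissions against the real user ID, and that the answer can be stale by the time you actually open the file, so it suits a quick availability check rather than a guarantee:

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if path exists and is readable
   by the real user ID at the moment of the call, 0 otherwise. */
static int is_readable(const char *path)
{
    return access(path, R_OK) == 0;
}
```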

John Zwinck answered Sep 22 '22 03:09


I'd tend to agree with the linked article's basic conclusion that you generally shouldn't use the fseek/ftell code directly in the mainstream of your code, but you probably shouldn't use fstat directly either. If you want the size of a file, most of your code should call something with a clear, direct name like filesize.

Now, it probably is better to implement that function using fstat where available, and (for example) FindFirstFile on Windows, the most obvious platform where fstat usually won't be available.
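A rough sketch of such a wrapper is below. The name filesize and its signature are just one possible choice, not an existing API. Since it takes a path rather than an open file, the POSIX branch uses stat() instead of fstat(); the Windows branch uses FindFirstFileA and has not been tested here:

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

/* Hypothetical wrapper: stores the size of the regular file at path
   in *size_out. Returns 0 on success, -1 on failure. */
static int filesize(const char *path, uint64_t *size_out)
{
#ifdef _WIN32
    /* Note: FindFirstFile treats '*' and '?' in path as wildcards. */
    WIN32_FIND_DATAA fd;
    HANDLE h = FindFirstFileA(path, &fd);
    if (h == INVALID_HANDLE_VALUE) {
        return -1;
    }
    FindClose(h);
    if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
        return -1;
    }
    /* The size is reported as two 32-bit halves. */
    *size_out = ((uint64_t)fd.nFileSizeHigh << 32) | fd.nFileSizeLow;
    return 0;
#else
    struct stat st;
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode)) {
        return -1;
    }
    *size_out = (uint64_t)st.st_size;
    return 0;
#endif
}
```

The rest of the program then calls filesize("foo.bin", &size) and never needs to know which platform call sits underneath.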

The other side of the story is that many (most?) of the limitations on fseek with respect to binary files actually originated with CP/M, which didn't explicitly store the size of a file anywhere. The end of a text file was signaled by a control-Z. For a binary file, however, all you really knew was what sectors were used to store the file. In the last sector, you had some amount of unused data that was often (but not always) zero-filled. Unfortunately, there might be zeros that were significant, and/or non-zero values that weren't significant.

If the entire C standard had been written just before being approved (e.g., if it had been started in 1988 and finished in 1989) they'd probably have ignored CP/M completely. For better or worse, however, they started work on the C standard around 1982 or so, when CP/M was still in wide enough use that it couldn't be ignored. By the time CP/M was gone, many of the decisions had already been made and I doubt anybody wanted to revisit them.

For most people today, however, there's just no point in worrying about this: most code won't port to CP/M without massive work, and this is one of the relatively minor problems to deal with. Making a modern program run in only 48K (or so) of memory for both the code and data is a much more serious problem (having a maximum of a megabyte or so for mass storage would be another serious problem).

CERT does have one good point though: you probably should not (as is often done) find the size of a file, allocate that much space, and then assume the contents of the file will fit there. Even though fseek/ftell will give you the correct size on modern systems, that size could be stale by the time you actually read the data (the file may have grown in the meantime), so you could overrun your buffer anyway.
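One way to avoid trusting a possibly stale size is to treat it only as a hint and let the read loop decide when the file ends. The sketch below assumes that approach; read_all and its parameters are hypothetical names, and doubling the buffer is just one growth policy:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything remaining in fp into a malloc'd
   buffer. size_hint is only a starting guess (0 means "unknown").
   Returns the buffer (caller frees) and stores its length in *len_out,
   or returns NULL on read or allocation failure. */
static char *read_all(FILE *fp, size_t size_hint, size_t *len_out)
{
    /* One spare byte lets a correct hint hit end-of-file on the first read. */
    size_t cap = (size_hint > 0 && size_hint < SIZE_MAX) ? size_hint + 1 : 4096;
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        /* Never ask for more than the space actually left in the buffer. */
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap) {              /* short read: end of file or error */
            if (ferror(fp)) {
                free(buf);
                return NULL;
            }
            break;
        }
        /* Buffer is full, so the file may be larger than the hint: grow. */
        if (cap > SIZE_MAX / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    *len_out = len;
    return buf;
}
```

With this shape, a file that grows after its size was measured costs an extra realloc instead of a buffer overrun.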

Jerry Coffin answered Sep 19 '22 03:09