Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What makes a C standard library function dangerous, and what is the alternative?

Tags:

c

function

People also ask

What is a dangerous C function?

Dangers in C/C++ C users must avoid using dangerous functions that do not check bounds unless they've ensured that the bounds will never get exceeded. Functions to avoid in most cases (or ensure protection) include the functions strcpy(3), strcat(3), sprintf(3) (with cousin vsprintf(3)), and gets(3).

What is C standard library function?

The C standard library provides macros, type definitions and functions for tasks such as string handling, mathematical computations, input/output processing, memory management, and several other operating system services.


In the old days, most of the string functions had no bounds checking. Of course they couldn't just delete the old functions, or modify their signatures to include an upper bound, that would break compatibility. Now, for almost every one of those functions, there is an alternative "n" version. For example:

strcpy -> strncpy
strlen -> strnlen
strcmp -> strncmp
strcat -> strncat
strdup -> strndup
sprintf -> snprintf
wcscpy -> wcsncpy
wcslen -> wcsnlen

And more.

See also https://github.com/leafsr/gcc-poison which is a project to create a header file that causes gcc to report an error if you use an unsafe function.


Yes, fgets(..., ..., STDIN) is a good alternative to gets(), because it takes a size parameter (gets() has in fact been removed from the C standard entirely in C11). Note that fgets() is not exactly a drop-in replacement for gets(), because the former will include the terminating \n character if there was room in the buffer for a complete line to be read.

scanf() is considered problematic in some cases, rather than straight-out "bad", because if the input doesn't conform to the expected format it can be impossible to recover sensibly (it doesn't let you rewind the input and try again). If you can just give up on badly formatted input, it's useable. A "better" alternative here is to use an input function like fgets() or fgetc() to read chunks of input, then scan it with sscanf() or parse it with string handling functions like strchr() and strtol(). Also see below for a specific problem with the "%s" conversion specifier in scanf().

It's not a standard C function, but the BSD and POSIX function mktemp() is generally impossible to use safely, because there is always a TOCTTOU race condition between testing for the existence of the file and subsequently creating it. mkstemp() or tmpfile() are good replacements.

strncpy() is a slightly tricky function, because it doesn't null-terminate the destination if there was no room for it. Despite the apparently generic name, this function was designed for creating a specific style of string that differs from ordinary C strings - strings stored in a known fixed width field where the null terminator is not required if the string fills the field exactly (original UNIX directory entries were of this style). If you don't have such a situation, you probably should avoid this function.

atoi() can be a bad choice in some situations, because you can't tell when there was an error doing the conversion (e.g., if the number exceeded the range of an int). Use strtol() if this matters to you.

strcpy(), strcat() and sprintf() suffer from a similar problem to gets() - they don't allow you to specify the size of the destination buffer. It's still possible, at least in theory, to use them safely - but you are much better off using strncat() and snprintf() instead (you could use strncpy(), but see above). Do note that whereas the n for snprintf() is the size of the destination buffer, the n for strncat() is the maximum number of characters to append and does not include the null terminator. Another alternative, if you have already calculated the relevant string and buffer sizes, is memmove() or memcpy().

On the same theme, if you use the scanf() family of functions, don't use a plain "%s" - specify the size of the destination e.g. "%200s".


strtok() is generally considered to be evil because it stores state information between calls. Don't try running THAT in a multithreaded environment!


Strictly speaking, there is one really dangerous function. It is gets() because its input is not under the control of the programmer. All other functions mentioned here are safe in and of themselves. "Good" and "bad" boils down to defensive programming, namely preconditions, postconditions and boilerplate code.

Let's take strcpy() for example. It has some preconditions that the programmer must fulfill before calling the function. Both strings must be valid, non-NULL pointers to zero terminated strings, and the destination must provide enough space with a final string length inside the range of size_t. Additionally, the strings are not allowed to overlap.

That is quite a lot of preconditions, and none of them is checked by strcpy(). The programmer must be sure they are fulfilled, or he must explicitly test them with additional boilerplate code before calling strcpy():

n = DST_BUFFER_SIZE;
if ((dst != NULL) && (src != NULL) && (strlen(dst)+strlen(src)+1 <= n))
{
    strcpy(dst, src);
}

Already silently assuming the non-overlap and zero-terminated strings.

strncpy() does include some of these checks, but it adds another postcondition the programmer must take care for after calling the function, because the result may not be zero-terminated.

strncpy(dst, src, n);
if (n > 0)
{
    dst[n-1] = '\0';
}

Why are these functions considered "bad"? Because they would require additional boilerplate code for each call to really be on the safe side when the programmer assumes wrong about the validity, and programmers tend to forget this code.

Or even argue against it. Take the printf() family. These functions return a status that indicate error and success. Who checks if the output to stdout or stderr succeeded? With the argument that you can't do anything at all when the standard channels are not working. Well, what about rescuing the user data and terminating the program with an error-indicating exit code? Instead of the possible alternative of crash and burn later with corrupted user data.

In a time- and money-limited environment it is always the question of how much safety nets you really want and what is the resulting worst case scenario? If it is a buffer overflow as in case of the str-functions, then it makes sense to forbid them and probably provide wrapper functions with the safety nets already within.

One final question about this: What makes you sure that your "good" alternatives are really good?


Any function that does not take a maximum length parameter and instead relies on an end-of- marker to be present (such as many 'string' handling functions).

Any method that maintains state between calls.


  • sprintf is bad, does not check size, use snprintf
  • gmtime, localtime -- use gmtime_r, localtime_r