Every C programmer knows there is no way to securely use gets unless standard input is connected to a trusted source. But why didn't the developers of C notice such a glaring mistake before it was made an official part of the C standard? And why did it take until C11 to remove it from the standard and replace it with a function that performs bounds-checking? I'm aware fgets is typically used in its place, but that has the annoying habit of keeping the \n at the end.
The answer is simply that C is a very old language, dating back to the early 1970s. The sort of security threats we take for granted today weren't on the horizon when the language was first developed.
For a long time, C was the in-house language at AT&T. It was difficult to find commercial compilers for C until the late 1970s. But when the UNIX operating system was rewritten in C, compilers became more readily available, and the language took off, especially after Kernighan and Ritchie's 1978 standard reference, The C Programming Language.
Despite its widespread and growing popularity, the language itself wasn't standardized until 1989. By that point, C was nearly 20 years old and there was a lot of installed C code. The standards committee was relatively conservative; it worked on the assumption that the standard would codify existing practices rather than require new ways of doing things. The buffer overflow vulnerability of gets() seemed trivial compared to the cost of declaring a large portion of the installed code base nonstandard.
The Morris internet worm of 1988 did make clear the need for more secure coding practices, but even so, back in the late 1980s the Internet was still extremely nascent. (If I remember correctly, an early 1990s Macintosh book by David Pogue answered the question of how to connect a Mac to the Internet with something to the effect of "Don't bother, the Internet isn't worth the effort".) One can hardly fault the standards committee for misjudging the exponential growth of the Internet and its attendant security threats.
When the standard was revised in 1999, matters had changed, of course. However, the committee again chose to be cautious about invalidating existing code, and so kept gets() rather than removing it altogether; as far as I can tell, it was only formally deprecated afterwards, by a later corrigendum to C99. It's debatable whether this was the right decision, but it wasn't obviously the wrong one.
Retaining gets() in the C11 standard would obviously have been the wrong decision, and the current standard very properly eliminates it. But your question rests on the assumption that this was "always already" the right thing to do, and from a historical perspective, that assumption seems questionable.
The mandate for the initial ANSI standard was to codify existing practice, not invent a new language.
That's made clear in the rationale documents:
The original X3J11 charter clearly mandated codifying common existing practice, and the C89 Committee held fast to precedent wherever that was clear and unambiguous. The vast majority of the language defined by C89 was precisely the same as defined in Appendix A of the first edition of The C Programming Language by Brian Kernighan and Dennis Ritchie, and as was implemented in almost all C translators of the time. (This document is hereinafter referred to as K&R.)
Hence, because gets was part of the language, it was made part of the standard. There are other unsafe things that are still there; practitioners are expected to know how to use their tools wisely.
And, if you're worried by the superfluous newline, it's easy enough to fix:
{
    /* Strip one trailing newline, if present (needs <string.h>). */
    size_t len = strlen (buffer);
    if ((len > 0) && (buffer[len-1] == '\n'))
        buffer[len-1] = '\0';
}
or the simpler:
buffer[strcspn (buffer, "\n")] = '\0';
You could even write your own fgets front end to do that for you, such as this one here, apparently written by one of the more intelligent and good-looking members of SO :-)
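For illustration, here is a minimal sketch of what such an fgets front end might look like. It is not the linked implementation; the function name read_line and the LINE_* status codes are made up for this example. It reads one line, drops the trailing newline, and discards the remainder of any line that doesn't fit, so the next call starts on a fresh line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum { LINE_OK = 0, LINE_EOF = 1, LINE_TOO_LONG = 2, LINE_BAD_ARG = 3 };

/* Read one line from `in` into `buf` (capacity `size` bytes), without
   the trailing newline. An over-long line is truncated to size-1
   characters and the rest of it is thrown away. */
static int read_line(FILE *in, char *buf, int size)
{
    if (in == NULL || buf == NULL || size < 2)
        return LINE_BAD_ARG;

    if (fgets(buf, size, in) == NULL)
        return LINE_EOF;            /* nothing read: end of file or error */

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';            /* normal case: strip the newline */
        return LINE_OK;
    }

    /* No newline was stored. Peek at the next character to find out why. */
    int ch = getc(in);
    if (ch == EOF || ch == '\n')
        return LINE_OK;             /* last line without '\n', or an exact fit */

    /* The line really was too long: skip the rest of it. */
    while (ch != '\n' && ch != EOF)
        ch = getc(in);
    return LINE_TOO_LONG;
}
```

A caller would typically loop on read_line(stdin, buf, sizeof buf) until it returns LINE_EOF, treating LINE_TOO_LONG as a cue to warn the user or re-prompt. Whether to truncate, reject, or grow the buffer on over-long input is a design choice; this sketch simply truncates and reports it.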