
Why was gets part of the C standard in the first place?

Every C programmer knows there is no way to securely use gets unless standard input is connected to a trusted source. But why didn't the developers of C notice such a glaring mistake before it was made an official part of the C standard? And why did it take until C11 to remove it from the standard and replace it with a function that performs bounds-checking? I'm aware fgets is typically used in its place, but that has the annoying habit of keeping the \n at the end.

flarn2006 asked Aug 02 '13 01:08



2 Answers

The answer is simply that C is a very old language, dating back to the early 1970s. The sort of security threats we take for granted today weren't on the horizon when the language was first developed.

For a long time, C was the in-house language at AT&T. It was difficult to find commercial compilers for C until the late 1970s. But when the UNIX operating system was rewritten in C, compilers became more readily available, and the language took off, especially after Kernighan and Ritchie's 1978 standard reference, The C Programming Language.

Despite its widespread and growing popularity, the language itself wasn't standardized until 1989. By that point, C was nearly 20 years old and there was a lot of installed C code. The standards committee was relatively conservative; it worked on the assumption that the standard would codify existing practices rather than require new ways of doing things. The buffer overflow vulnerability of gets() seemed trivial compared to the cost of declaring a large portion of the installed code base nonstandard.

The Morris Internet worm of 1988 did make clear the need for more secure coding practices, but even so, back in the late 1980s the Internet was still extremely nascent. (If I remember correctly, an early 1990s Macintosh book by David Pogue answered the question of how to connect a Mac to the Internet with something to the effect of "Don't bother, the Internet isn't worth the effort".) One can hardly fault the standards committee for misjudging the exponential growth of the Internet and its attendant security threats.

When the standard was revised in 1999, matters had changed, of course. However, the committee again chose to be cautious about invalidating existing code: gets() stayed in C99, and a later technical corrigendum marked it deprecated rather than removing it altogether. It's debatable whether this was the right decision, but it wasn't obviously the wrong one.

Retaining gets() in the C11 standard would obviously have been the wrong decision, and the current standard very properly eliminates it. But your question rests on the assumption that this was "always already" the right thing to do, and from a historical perspective, that assumption seems questionable.

verbose answered Oct 05 '22 11:10


The mandate for the initial ANSI standard was to codify existing practice, not invent a new language.

That's made clear in the rationale documents:

The original X3J11 charter clearly mandated codifying common existing practice, and the C89 Committee held fast to precedent wherever that was clear and unambiguous. The vast majority of the language defined by C89 was precisely the same as defined in Appendix A of the first edition of The C Programming Language by Brian Kernighan and Dennis Ritchie, and as was implemented in almost all C translators of the time. (This document is hereinafter referred to as K&R.)

Hence, because gets was part of the language, it was made part of the standard. Other unsafe things are still there as well; practitioners are expected to know how to use their tools wisely.

And, if you're worried by the superfluous newline, it's easy enough to fix:

{
    size_t len = strlen (buffer);
    if ((len > 0) && (buffer[len-1] == '\n'))
        buffer[len-1] = '\0';
}

or the simpler:

buffer[strcspn (buffer, "\n")] = '\0';

You could even write your own fgets front end to do that for you, such as this one here, apparently written by one of the more intelligent and good looking members of SO :-)
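For illustration, here is a minimal sketch of what such a front end might look like. It is not the linked code: the name read_line and its return codes are assumptions made up for this example. It reads at most one line with fgets, strips the trailing newline, and discards the rest of the line if it did not fit in the buffer.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the name and return codes are illustrative only.
 * Reads one line from fp into buf (capacity size) without the newline.
 * Returns  0 if a complete line was stored,
 *          1 if the line was too long (the excess is read and discarded),
 *         -1 on end of file, read error, or invalid arguments. */
int read_line(char *buf, size_t size, FILE *fp)
{
    if (buf == NULL || size == 0 || fp == NULL)
        return -1;

    /* fgets takes an int count, so clamp very large sizes. */
    int n = (size > (size_t)INT_MAX) ? INT_MAX : (int)size;

    if (fgets(buf, n, fp) == NULL)
        return -1;

    /* Strip the newline if fgets stored one. */
    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';
        return 0;
    }

    /* No newline stored: either the line did not fit, or the input
     * ended without one. Drain the remainder of the line so the next
     * call starts on a fresh line. */
    int ch;
    int truncated = 0;
    while ((ch = getc(fp)) != EOF && ch != '\n')
        truncated = 1;

    return truncated;
}
```

A caller would declare something like char line[80]; and call read_line (line, sizeof line, stdin), treating a return of 1 as "input too long" instead of silently processing a partial line.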

paxdiablo answered Oct 05 '22 11:10