Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the underlying difference between printf(s) and printf("%s", s)?

The question is plain and simple, s is a string, I suddenly got the idea to try to use printf(s) to see if it would work and I got a warning in one case and none in the other.

char* s = "abcdefghij\n"; printf(s);  // Warning raised with gcc -std=c11:  // format not a string literal and no format arguments [-Wformat-security]  // On the other hand, if I use   char* s = "abc %d efg\n"; printf(s, 99);  // I get no warning whatsoever, why is that?  // Update, I've tested this: char* s = "random %d string\n"; printf(s, 99, 50);  // Results: no warning, output "random 99 string". 

So what's the underlying difference between printf(s) and printf("%s", s) and why do I get a warning in just one case?

like image 202
Mikael Avatar asked Sep 09 '16 15:09

Mikael


People also ask

What is %s in printf?

%s and string We can print the string using %s format specifier in printf function. It will print the string from the given starting address to the null '\0' character. String name itself the starting address of the string. So, if we give string name it will print the entire string.

What is the difference between %C and %S in C programming?

"%s" expects a pointer to a null-terminated string ( char* ). "%c" expects a character ( int ).

What is the difference between printf and sprintf )?

The only difference between sprintf() and printf() is that sprintf() writes data into a character array, while printf() writes data to stdout, the standard output device.

What is difference between print and printf in C language?

The difference between printf and print is the format argument. This is an expression whose value is taken as a string; it specifies how to output each of the other arguments. It is called the format string. The format string is very similar to that in the ISO C library function printf() .


2 Answers

In the first case, the non-literal format string could perhaps come from user code or user-supplied (run-time) data, in which case it might contain %s or other conversion specifications, for which you've not passed the data. This can lead to all sorts of reading problems (and writing problems if the string includes %n — see printf() or your C library's manual pages).

In the second case, the format string controls the output and it doesn't matter whether any string to be printed contains conversion specifications or not (though the code shown prints an integer, not a string). The compiler (GCC or Clang is used in the question) assumes that because there are arguments after the (non-literal) format string, the programmer knows what they're up to.

The first is a 'format string' vulnerability. You can search for more information on the topic.

GCC knows that most times the single argument printf() with a non-literal format string is an invitation to trouble. You could use puts() or fputs() instead. It is sufficiently dangerous that GCC generates the warnings with the minimum of provocation.

The more general problem of a non-literal format string can also be problematic if you are not careful — but extremely useful assuming you are careful. You have to work harder to get GCC to complain: it requires both -Wformat and -Wformat-nonliteral to get the complaint.

From the comments:

So ignoring the warning, as if I really know what I am doing and there will be no errors, is one or another more efficient to use or are they the same? Considering both space and time.

Of your three printf() statements, given the tight context that the variable s is as assigned immediately above the call, there is no actual problem. But you could use puts(s) if you omitted the newline from the string or fputs(s, stdout) as it is and get the same result, without the overhead of printf() parsing the entire string to find out that it is all simple characters to be printed.

The second printf() statement is also safe as written; the format string matches the data passed. There is no significant difference between that and simply passing the format string as a literal — except that the compiler can do more checking if the format string is a literal. The run-time result is the same.

The third printf() passes more data arguments than the format string needs, but that is benign. It isn't ideal, though. Again, the compiler can check better if the format string is a literal, but the run-time effect is practically the same.

From the printf() specification linked to at the top:

Each of these functions converts, formats, and prints its arguments under control of the format. The format is a character string, beginning and ending in its initial shift state, if any. The format is composed of zero or more directives: ordinary characters, which are simply copied to the output stream, and conversion specifications, each of which shall result in the fetching of zero or more arguments. The results are undefined if there are insufficient arguments for the format. If the format is exhausted while arguments remain, the excess arguments shall be evaluated but are otherwise ignored.

In all these cases, there is no strong indication of why the format string is not a literal. However, one reason for wanting a non-literal format string might be that sometimes you print the floating point numbers in %f notation and sometimes in %e notation, and you need to choose which at run-time. (If it is simply based on value, %g might be appropriate, but there are times when you want the explicit control — always %e or always %f.)

like image 174
Jonathan Leffler Avatar answered Sep 30 '22 09:09

Jonathan Leffler


The warning says it all.

First, to discuss about the issue, as per the signature, the first parameter to printf() is a format string which can contain format specifiers (conversion specifier). In case, a string contains a format specifier and the corresponding argument is not supplied, it invokes undefined behavior.

So, a cleaner (or safer) approach (of printing a string which needs no format specification) would be puts(s); over printf(s); (the former does not process s for any conversion specifiers, removing the reason for the possible UB in the later case). You can choose fputs(), if you're worried about the ending newline that automatically gets added in puts().


That said, regarding the warning option, -Wformat-security from the online gcc manual

At present, this warns about calls to printf and scanf functions where the format string is not a string literal and there are no format arguments, as in printf (foo);. This may be a security hole if the format string came from untrusted input and contains %n.

In your first case, there's only one argument supplied to printf(), which is not a string literal, rather a variable, which can be very well generated/ populated at run time, and if that contains unexpected format specifiers, it may invoke UB. Compiler has no way to check for the presence of any format specifier in that. That is the security problem there.

In the second case, the accompanying argument is supplied, the format specifier is not the only argument passed to printf(), so the first argument need not to be verified. Hence the warning is not there.


Update:

Regarding the third one, with excess argument that required by the supplied format string

printf(s, 99, 50); 

quoting from C11, chapter §7.21.6.1

[...] If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored. [...]

So, passing excess argument is not a problem (from the compiler perspective) at all and it is well defined. NO scope for any warning there.

like image 40
Sourav Ghosh Avatar answered Sep 30 '22 08:09

Sourav Ghosh