
Is scanf("%d%d", &x, &x) well defined?

Is the following code well defined?

#include <stdio.h>

int ScanFirstOrSecond(const char *s, int *dest) {
    return sscanf(s, "%d%d", dest, dest);
}

int main(void) {
    int x = 4;
    ScanFirstOrSecond("5", &x);
    printf("%d\n", x);  // prints 5

    // Here is the tricky bit
    ScanFirstOrSecond("6 7", &x);
    printf("%d\n", x);  // prints 7
    return 0;
}

In other words, do the ... arguments have an implied restrict to them?

The most applicable C spec I found is

The fscanf function executes each directive of the format in turn. ... C11dr §7.21.6.2 4

chux - Reinstate Monica asked Mar 01 '16 00:03



3 Answers

The short answer is: Yes, it is defined:

scanf will attempt to convert a sequence of bytes from stdin as an integer written in base 10, with optional leading white space and an optional sign. If successful, the number will be stored into x. scanf will then perform these steps a second time. The return value can be EOF, 0, 1 or 2; for the latter two, the last number converted will have been stored into x.

The long answer is somewhat more subtle:

It seems the C Standard does specify that the values are stored in the order of the format string. Quoting the C11 Standard:

7.21.6.2 The fscanf function

...

4 The fscanf function executes each directive of the format in turn. When all directives have been executed, or if a directive fails (as detailed below), the function returns.

...

7 A directive that is a conversion specification defines a set of matching input sequences, as described below for each specifier. A conversion specification is executed in the following steps:

...

10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result.

...

16 The fscanf function returns the value of the macro EOF if an input failure occurs before the first conversion (if any) has completed. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.

Nowhere else in this specification are any accesses to the output objects even mentioned.

Yet the wording of the Standard seems to indicate that if two pointers point to the same object, the behavior might be unexpected: "the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result." This phrase is somewhat ambiguous: what does "that has not already received a conversion result" refer to, the object or the argument? Objects receive conversion results, not the pointer arguments. In your contorted example, the object x has already received a conversion result, so it should not receive another one... But as noted by supercat, this interpretation is overly restrictive, as it would imply that all converted values be stored into the first target object.

So it appears fully specified and well defined, but the wording of the specification could be improved to remove a potential ambiguity.

chqrlie answered Nov 16 '22 18:11



scanf() family functions execute the directives you give them in the format string strictly in turn. So the first value will be read in, and then the second one, overwriting the first. Nothing UB here.

Magisch answered Nov 16 '22 17:11



Yes, well defined. It means "read the first token into *dest, then read the second token into *dest again". It's weird but legal, because sscanf() executes the directives in the format string in strict order.

ddbug answered Nov 16 '22 19:11
