Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is scanf guaranteed to not change the value on failure?

If a scanf family function fails to match the current specifier, is it permitted to write to the storage where it would have stored the value on success?

On my system the following outputs 213 twice but is that guaranteed?

The language in the standard (C99 or C11) does not seem to clearly specify that the original value should remain unchanged (whether it was indeterminate or not).

#include <stdio.h>

int main()
{
    int d = 213;

    // matching failure
    sscanf("foo", "%d", &d);
    printf("%d\n", d);

    // input failure
    sscanf("", "%d", &d);
    printf("%d\n", d);
}
like image 634
M.M Avatar asked Sep 07 '14 22:09

M.M


2 Answers

The relevant part of the C11 standard is (7.21.6.2, for fscanf):

7 A directive that is a conversion specification defines a set of matching input sequences, as described below for each specifier. A conversion specification is executed in the following steps:

8 […]

9 An input item is read from the stream, unless the specification includes an n specifier. An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.285) The first character, if any, after the input item remains unread. If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.

10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. […]

To me, the words “step” and “If the length of the input item is zero, the execution of the directive fail” indicate that if the input does not match a specifier in the format, interpretation stops before any assignment for that specifier has occurred.


On the other hand, the subclause 4 about the ones quoted makes it clear that specifiers up to the failing one are assigned, again using language appropriate for ordered sequences of events:

4 The fscanf function executes each directive of the format in turn. When all directives have been executed, or if a directive fails (as detailed below), the function returns.

like image 135
Pascal Cuoq Avatar answered Oct 06 '22 13:10

Pascal Cuoq


Judging from ISO/IEC 9899:2011 §7.21.6.2 The fscanf function:

¶10 Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.

In the larger context, this seems to mean that the assignment to the target variable only occurs after the conversion is successful. For numeric types, that makes sense and is readily achievable. For string types, it is not so clear cut, but it should work the same way (the text quoted does state that the assignment only occurs if there is no matching failure or input failure). However, if there is an encoding error part way through a string (%s or %30c or %[a-z]), it would not be surprising to find that the first part of the string is changed even though the conversion as a whole failed. This could probably be regarded as a bug. Stimulating the bug accurately might be hard; for example, it might require UTF-8 input and an invalid byte such as 0xC0 or 0xF5 in the input stream.

like image 29
Jonathan Leffler Avatar answered Oct 06 '22 13:10

Jonathan Leffler