
sscanf(s, "%u", &v) matching signed integers

After Cppcheck complained about "%u" being the wrong format specifier for scanning into an int variable, I changed the format to "%d". Taking a second look before committing the change, I thought that the original intention might have been to reject negative inputs. I wrote two small programs to see the difference:

Specifier %d

#include <iostream>
#include <cstdio>
using namespace std;

int main() {
    const char* s = "-4";
    int value = -1;
    int res = sscanf(s, "%d", &value);
    cout << "value:" << value << endl;
    cout << "res:" << res << endl;
    return 0;
}

see also https://ideone.com/OR3IKN

Specifier %u

#include <iostream>
#include <cstdio>
using namespace std;

int main() {
    const char* s = "-4";
    int value = -1;
    int res = sscanf(s, "%u", &value);
    cout << "value:" << value << endl;
    cout << "res:" << res << endl;
    return 0;
}

see also https://ideone.com/WPWdqi

Result(s)

Surprisingly, both conversion specifiers accept the sign:

value:-4
res:1

I had a look at the documentation on cppreference.com. For C (scanf, fscanf, sscanf, scanf_s, fscanf_s, sscanf_s) as well as C++ (std::scanf, std::fscanf, std::sscanf), the description of the "%u" conversion specifier is the same (emphasis mine):

matches an unsigned decimal integer.
The format of the number is the same as expected by strtoul() with the value 10 for the base argument.

Is the observed behaviour standard-compliant? Where can I find this documented?

[Update] Undefined Behaviour, really, why?

I read that it was simply UB. To add to the confusion, here is a version that declares value as unsigned (https://ideone.com/nNBkqN). I think the initialization with -1 still behaves as expected, but "%u" obviously still matches the sign:

#include <iostream>
#include <cstdio>

using namespace std;

int main() {
    const char* s = "-4";
    unsigned value = -1;
    cout << "value before:" << value << endl;
    int res = sscanf(s, "%u", &value);
    cout << "value after:" << value << endl;
    cout << "res:" << res << endl;
    return 0;
}

Result:

value before:4294967295
value after:4294967292
res:1
Wolf asked Mar 09 '23 02:03


1 Answer

There are two separate issues.

  1. %u expects an unsigned int* argument; passing an int* is UB.
  2. Does %u match -4? Yes. The expected format is that of strtoul with base 10, and the strtoul documentation makes clear that a leading minus sign is allowed.
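
If the goal is to reject negative input, the sign has to be checked explicitly, because strtoul itself accepts it: strtoul("-4", nullptr, 10) parses 4 and negates it in unsigned long, giving ULONG_MAX - 3. The following is only a minimal sketch of one way to do that check; the helper name parse_unsigned is hypothetical and not part of the original code, and it tolerates trailing characters after the digits, much like sscanf with "%u" does.

```cpp
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper (an assumption for illustration, not from the post):
// parses a non-negative decimal integer and rejects a leading minus sign.
bool parse_unsigned(const char* s, unsigned& out) {
    // strtoul skips leading whitespace on its own; skip it here as well so
    // that the sign check inspects the first significant character.
    while (std::isspace(static_cast<unsigned char>(*s))) {
        ++s;
    }
    if (*s == '-') {
        return false;  // strtoul would accept this and negate the value
    }

    errno = 0;
    char* end = nullptr;
    const unsigned long v = std::strtoul(s, &end, 10);
    if (end == s) {
        return false;  // no digits were consumed
    }
    if (errno == ERANGE || v > UINT_MAX) {
        return false;  // value does not fit into unsigned
    }
    out = static_cast<unsigned>(v);
    return true;
}
```

With this helper, parse_unsigned("-4", value) returns false and leaves value untouched, while parse_unsigned("4", value) returns true and stores 4.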
T.C. answered Mar 14 '23 19:03