
What happens if I forget to close a scanset?

Suppose I forget the closing right square bracket ] of a scanset. What happens then? Does it invoke undefined behavior?

Example:

char str[] = "Hello! One Two Three";
char s1[50] = {0}, s2[50] = {0};
sscanf(str, "%s %[^h", s1, s2); /* UB? */
printf("s1='%s' s2='%s'\n", s1, s2);

I get a warning from GCC when compiling:

source_file.c: In function ‘main’:
source_file.c:11:5: warning: no closing ‘]’ for ‘%[’ format [-Wformat=]
     sscanf(str, "%s %[^h", s1, s2); /* UB? */

and the output is:

s1='Hello!' s2=''

I've also noticed that sscanf returns 1. But what exactly is going on here?

I've checked the C11 standard, but found no information related to this.

Spikatrix asked Jan 30 '16 13:01



1 Answer

Excellent! You should file a defect report for C11!

Here is the relevant part of C11 7.21.6.2:

... The conversion specifier includes all subsequent characters in the format string, up to and including the matching right bracket (]). The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket.

A strict interpretation of "The characters between the brackets" is that, in the absence of a closing bracket, there are no such characters; but when ^ is the first character after [, that reading becomes inconsistent. GCC is kind enough to point out the probable error in the source code. The actual behavior is determined by the C library implementation and does not seem to be specified in the C Standard. As such, it is a form of undefined behavior that IMHO should really be documented as such in the Standard.

chqrlie answered Nov 06 '22 04:11