
Is scanf's "regex" support a standard?

Tags: c, gcc, scanf

Is scanf's "regex" support a standard? I can't find the answer anywhere.

This code works in gcc but not in Visual Studio:

scanf("%[^\n]",a);

Is it a Visual Studio fault or a gcc extension?

EDIT: It looks like Visual Studio does work, but you have to account for the difference in line endings between Linux (\n) and Windows (\r\n).

bratao asked May 14 '11 00:05


1 Answer

That particular format string should work fine in a conforming implementation. It is not a regular expression: the [ character introduces a scanset, which matches a non-empty sequence of characters drawn from a set (with a leading ^ meaning that the set is the inversion of the characters supplied). In other words, the format specifier %[^\n] matches a non-empty run of characters up to, but not including, the next newline.
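As a minimal sketch of that behaviour, the following uses sscanf on an in-memory string rather than scanf on stdin. The helper name first_line and the 64-byte buffer are assumptions made for illustration, and the field width 63 is added so the conversion cannot overrun the array (the format in the question has no such limit).

```c
#include <stdio.h>

/* Hypothetical helper: copies the first line of `input`, without its
   newline, into `out` (at least 64 bytes). Returns sscanf's result:
   1 on a match, 0 if the line is empty, EOF if `input` is empty. */
int first_line(const char *input, char out[64])
{
    out[0] = '\0';
    /* %63[^\n]: at most 63 characters that are not '\n', plus the
       terminating null character added by sscanf. */
    return sscanf(input, "%63[^\n]", out);
}
```

Calling first_line("hello world\nnext", buf) stores "hello world" and returns 1. Because a scanset must match at least one character, input that starts with a newline produces a matching failure (return value 0) and leaves the newline unread.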

From C99 7.19.6.2, slightly paraphrased:

The [ format specifier matches a nonempty sequence of characters from a set of expected characters (the scanset). If no l length modifier is present, the corresponding argument shall be a pointer to the initial element of a character array large enough to accept the sequence and a terminating null character, which will be added automatically.

If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the sequence and the terminating null wide character, which will be added automatically.

The conversion specifier includes all subsequent characters in the format string, up to and including the matching right bracket ]. The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex ^, in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket. If the conversion specifier begins with [] or [^], the right bracket character is in the scanlist and the next following right bracket character is the matching right bracket that ends the specification; otherwise the first following right bracket character is the one that ends the specification. If a - character is in the scanlist and is not the first, nor the second where the first character is a ^, nor the last character, the behavior is implementation-defined.
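The bracket and hyphen rules quoted above can be sketched as follows. The three helper names are hypothetical and exist only for this illustration. The hyphen is placed last in the scanlist so that its meaning is fixed by the standard; a range such as 0-9 is deliberately avoided because that case is implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helpers; each writes its match into `out` (at least 16 bytes)
   and returns sscanf's result. */

/* "]" directly after "[" is part of the scanlist: the set is { ']', 'a' }. */
int match_brackets(const char *input, char out[16])
{
    out[0] = '\0';
    return sscanf(input, "%15[]a]", out);
}

/* "]" directly after "[^" is part of the scanlist: match anything except ']'. */
int match_until_bracket(const char *input, char out[16])
{
    out[0] = '\0';
    return sscanf(input, "%15[^]]", out);
}

/* '-' as the last scanlist character is an ordinary member of the set. */
int match_digits_and_dash(const char *input, char out[16])
{
    out[0] = '\0';
    return sscanf(input, "%15[0123456789-]", out);
}
```

For example, match_brackets("]a]]x", buf) stores "]a]]", match_until_bracket("key]rest", buf) stores "key", and match_digits_and_dash("12-34x", buf) stores "12-34".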

It's possible, if MSVC isn't working correctly, that this is just one of the many examples where Microsoft either don't conform to the latest standard, or think they know better :-)
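The edit to the question suggests that line endings, rather than conformance, were the real issue. Here is a small sketch of how a carriage return can end up in the result when the data contains \r\n and is not translated by the I/O layer (for instance, a Windows-style file read on Linux or in binary mode). Both helper names are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: stops only at '\n', so a preceding '\r' is kept. */
int line_lf_only(const char *input, char out[64])
{
    out[0] = '\0';
    return sscanf(input, "%63[^\n]", out);
}

/* Hypothetical helper: stops at either '\r' or '\n', so the result is the
   same for "\n" and "\r\n" line endings. */
int line_without_eol(const char *input, char out[64])
{
    out[0] = '\0';
    return sscanf(input, "%63[^\r\n]", out);
}
```

Given "abc\r\ndef", line_lf_only stores "abc\r" while line_without_eol stores "abc". Both formats are standard scansets; the difference is only in which characters are excluded.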

paxdiablo answered Oct 07 '22 02:10