Does `printf("%.-1s\n", "foo")` invoke undefined behaviour?

According to the standards:

Each conversion specification is introduced by the character %. After the %, the following appear in sequence:

  • Zero or more flags [...].
  • An optional minimum field width. [...]
  • An optional precision that gives [...] the maximum number of bytes to be written for s conversions. The precision takes the form of a period (.) followed by [...] an optional decimal integer;
  • An optional length modifier [...].
  • A conversion specifier character [...].

Later:

A negative precision argument is taken as if the precision were omitted.

What I would expect from printf("%.-1s\n", "foo") according to how I interpret the standard definition:

The second quote I took from the standard suggests that we could pass a negative precision argument and that such a precision would be ignored.

So, printf("%.-1s\n", "foo") should be equivalent to printf("%s\n", "foo"), which would display "foo\n" and return 4.

Yet, here is the actual printf("%.-1s\n", "foo") behaviour on the system I use (OS X):

printf("%.-1s\n", "foo") displays " \n" and returns 2.

This is obviously different from what I was expecting.
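
Here is a minimal program reproducing the comparison (the %.-1s result in the comments is what I observe on OS X; other implementations may print something else, and many compilers will warn about the malformed directive):

#include <stdio.h>

int main(void)
{
    /* What I expected "%.-1s" to be equivalent to: */
    int expected = printf("%s\n", "foo");    /* prints "foo\n", returns 4 */

    /* What actually happens on my system (OS X): */
    int actual = printf("%.-1s\n", "foo");   /* prints " \n", returns 2 */

    printf("expected: %d, actual: %d\n", expected, actual);
    return 0;
}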

  • Is somehow my interpretation of the standards wrong?
  • Is this behaviour undefined?
  • Is passing a negative precision (edit: without asterisk) actually possible?
asked Jun 19 '17 by vmonteco


2 Answers

N1570-§7.21.6.1/p5:

As noted above, a field width, or precision, or both, may be indicated by an asterisk. In this case, an int argument supplies the field width or precision. The arguments specifying field width, or precision, or both, shall appear (in that order) before the argument (if any) to be converted. A negative field width argument is taken as a - flag followed by a positive field width. A negative precision argument is taken as if the precision were omitted.

The standard specifies that this applies only when an asterisk is used as the precision in the format string and a negative value is passed as an argument, as shown below:

printf("%.*s\n", -1, "foo");  // -1 will be ignored  

Paragraph 4 of the same section says:

[...] The precision takes the form of a period (.) followed either by an asterisk * (described later) or by an optional decimal integer; [...]

but it doesn't specifically say whether the decimal integer must be greater than 0 (as it does say for the field width of fscanf in §7.21.6.2/p3). The standard seems ambiguous at this point, and the result may be machine-dependent.
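
For contrast, a zero precision written directly in the format string is unambiguous: it limits the s conversion to zero bytes. A minimal sketch (the %.-1s line is the one whose output may vary):

#include <stdio.h>

int main(void)
{
    /* Well defined: precision 0 writes at most 0 bytes of the string. */
    printf("[%.0s]\n", "foo");   /* prints "[]" */

    /* The ambiguous case discussed above: a negative decimal integer
       directly in the format string; output may be machine-dependent. */
    printf("[%.-1s]\n", "foo");

    return 0;
}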

answered by haccks


  • Is somehow my interpretation of the standards wrong?

I take your interpretation to be summed up by this:

So, printf("%.-1s\n", "foo") should be equivalent to printf("%s\n", "foo"), which would display "foo\n" and return 4.

No. The provision you quote about negative precision arguments being ignored does not apply to this case. That provision is talking about the option of specifying the precision as * in the format string, and passing the value as a separate printf argument:

printf("%.*s\n", -1, "foo");

In that case, the negative precision argument causes printf() to behave as if no precision were specified. Your case is different.
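
A quick sketch of the case the provision does cover; on a conforming implementation, both calls print "foo\n" and return 4:

#include <stdio.h>

int main(void)
{
    /* A negative precision argument is taken as if the precision
       were omitted (N1570 7.21.6.1/p5). */
    int r1 = printf("%.*s\n", -1, "foo");  /* prints "foo\n", returns 4 */

    /* The equivalent directive with no precision at all. */
    int r2 = printf("%s\n", "foo");        /* prints "foo\n", returns 4 */

    printf("r1 = %d, r2 = %d\n", r1, r2);
    return 0;
}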

On the other hand, the standard does not here require the precision value appearing in the format string to be a nonnegative decimal integer. It does qualify the term "decimal integer" that way in several other places, including earlier in the same section, but it does not do so in the paragraph about the precision field.

  • Is this behaviour undefined?

No. There are two conflicting interpretations of the required semantics (see next), but either way, the standard defines the behavior. It could be interpreted either as

  • the behavior described for a negative precision argument also applies when a negative precision value is presented directly in the format string. This would have the advantage of consistency. However,

  • a literal reading of the standard would indicate that when the precision is presented as a negative decimal integer in the format string, then the ordinary semantics described in that section apply; for s directives, that would mean the negative precision expresses the maximum number of bytes to be output.

The behavior you observe is not consistent with the former interpretation, but given the practical difficulties in outputting fewer than 0 bytes, it comes as little surprise to me that the latter interpretation is not successfully implemented. I'm inclined to guess that the latter is what your implementation is trying to implement.
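
As an aside, and purely as speculation about how the format might be getting parsed rather than a claim about what your C library actually does: the output you report happens to match the fully defined directive %-1.0s, as if the - and the 1 in %.-1s were consumed as a flag and a field width after an empty (zero) precision:

#include <stdio.h>

int main(void)
{
    /* Precision 0 writes no bytes of "foo"; the - flag and field
       width 1 then pad the empty result to one column with a space. */
    int r = printf("%-1.0s\n", "foo");   /* prints " \n", returns 2 */
    printf("r = %d\n", r);
    return 0;
}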

I suspect that it was an unintentional omission at some stage to leave open the possibility of providing a negative value for the precision field, but intentional or not, the standard seems to allow it.

answered by John Bollinger