What is an encoding error for sprintf that should return -1?

Tags: c, error-handling, gcc, language-lawyer

I understand snprintf will return a negative value when "an encoding error occurs".

But what is a simple example of such an "encoding error" that will produce that result?

I'm working with the gcc 10.2.0 C compiler, and I've tried malformed format specifiers, unreasonably large numbers for field length, and even null format strings.

Malformed format specifiers just get printed literally
Unreasonably large numbers as length specifiers produce fatal errors
Null format strings also produce fatal errors

This relates to repeatedly doing something like:

length += snprintf(...

to build up a formatted string.

That might be safe if it is certain not to return a negative value.

Advancing the buffer pointer by a negative length could cause it to go out of bounds. But I'm looking for a case where that would actually happen. If there is such a case then the added complexity of this may be warranted:

length += result = snprintf(...

So far I couldn't find a scenario where it would be worth adding complexity to check for a value that the implementation may never return. Maybe you can give a simple example of one.

asked Dec 17 '20 03:12

Ted Shaneyfelt

2 Answers

What is an encoding error for sprintf that should return -1?

On my machine, "%ls" did not like the 0xFFFF - certainly an encoding error.

  char buf[42];
  wchar_t s[] = { 0xFFFF,49,50,51,0 };
  int i = snprintf(buf, sizeof buf, "<%ls>", s);
  printf("%d\n", i);

Output

-1
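
A minimal sketch of how this failure can be told apart from other errors, assuming the library follows the C standard's rule that errno is set to EILSEQ on an encoding error. The helper name format_wide is hypothetical, not part of any library, and the result for 0xFFFF depends on the current locale.

```c
#include <errno.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: formats a wide string into buf as "<...>".
 * Returns snprintf's result and stores the errno value it left
 * behind in *err (0 if errno was not touched). */
int format_wide(char *buf, size_t cap, const wchar_t *ws, int *err)
{
    errno = 0;                                /* clear stale errors first */
    int n = snprintf(buf, cap, "<%ls>", ws);  /* %ls converts via wcrtomb */
    *err = errno;
    return n;
}
```

A caller would treat n < 0 with *err == EILSEQ as "the wide string cannot be represented in the current locale" and any other negative result as a different kind of failure.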

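The code below returned -1, though less because of an encoding error than because of a pathological format string: it is just under 4 GiB long, so presumably the resulting length cannot be represented in an int.

#include <stdio.h>
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* memset */

int main(void) {
  /* 2^32 bytes; assumes a 64-bit size_t */
  size_t n = 0xFFFFFFFFLLu + 1;
  char *fmt = malloc(n);
  if (fmt == NULL) {
    puts("OOM");
    return -42;
  }
  /* Format string of 2^32 - 1 'x' characters, no conversion specifiers */
  memset(fmt, 'x', n);
  fmt[n - 1] = '\0';
  char buf[42];
  int i = snprintf(buf, sizeof buf, fmt);
  printf("%d %x\n", i, (unsigned) i);
  free(fmt);
  return 7;
}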

Output

-1 ffffffff

I did get a surprising -1 when passing too big a size, even though snprintf() only needed 6 bytes.

  char buf[42];
  int i = snprintf(buf, 4299195472, "Hello");
  printf("%d\n", i);

Output

-1
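
One plausible explanation, assuming a POSIX-style C library: POSIX specifies that snprintf fails with EOVERFLOW when the size argument is greater than INT_MAX, and 4299195472 is above that limit when int is 32 bits wide. The following is only a sketch of a wrapper that makes the limit explicit; checked_snprintf is a hypothetical name, not a library function.

```c
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: rejects a size above INT_MAX up front instead
 * of leaving it to the library. Returns -1 for an oversized size,
 * otherwise whatever vsnprintf returns. */
int checked_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    if (size > (size_t)INT_MAX)
        return -1;

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

Passing sizeof buf rather than a hand-written constant avoids the problem entirely for ordinary arrays.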

I was able to come up with a short example returning -1 on a *fprintf() to stdout due to orientation conflict.

#include <wchar.h>
#include <stdio.h>

int main() {
  int w = wprintf(L"Hello wide world\n");
  wprintf(L"%d\n", w);
  int s = printf("Hello world\n");
  wprintf(L"%d\n", s);
}

Output

Hello wide world
17
-1
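
The orientation of a stream is fixed by the first byte or wide operation performed on it, and fwide() with a mode of 0 reports it without changing it. A small sketch of how the conflict could be detected before calling printf; stream_orientation is a hypothetical helper name.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: classifies a stream's current orientation.
 * Returns 1 for wide, -1 for byte, 0 if the stream is not yet oriented. */
int stream_orientation(FILE *fp)
{
    int o = fwide(fp, 0);        /* mode 0 only queries, never sets */
    return (o > 0) - (o < 0);    /* normalise to -1, 0 or 1 */
}
```

In the program above, stream_orientation(stdout) would be expected to report a wide stream after the first wprintf call, which is why the later printf is rejected.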

answered Oct 25 '22 08:10

chux - Reinstate Monica

Normally you only expect an error from printf and family when an output error occurs. From the Linux man page:

If an output error is encountered, a negative value is returned.

So if you are writing to a FILE and an output error of some kind (EPIPE, EIO) occurs, you'll get a negative return value. For s[n]printf there is no output stream, so an output error cannot be the cause of a negative return value.

The standard talks about the possibility of an "encoding error", but only defines what that means with respect to wide character streams, with a note that byte streams might need to convert to wide streams in some cases.

An encoding error occurs if the character sequence presented to the underlying mbrtowc function does not form a valid (generalized) multibyte character, or if the code value passed to the underlying wcrtomb does not correspond to a valid (generalized) multibyte character. The wide character input/output functions and the byte input/output functions store the value of the macro EILSEQ in errno if and only if an encoding error occurs.

That would seem to imply that you can get an encoding error if you use the %ls or %lc formats to convert a wide string or wide character to bytes. Not sure if there are any other cases where it could occur.
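
Since a negative return is possible, the `length += snprintf(...)` pattern from the question can be wrapped so that the running length only ever moves forward. This is a minimal sketch under that assumption; append_fmt is a hypothetical helper, and it treats truncation as a failure too.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: appends formatted text at buf + *len.
 * Returns 0 on success, -1 if vsnprintf failed or the text did not fit.
 * *len is advanced only on success, so it can never move backwards. */
int append_fmt(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    if (*len >= cap)
        return -1;               /* no room left, not even for '\0' */

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= cap - *len) {
        buf[*len] = '\0';        /* discard any partial output */
        return -1;
    }
    *len += (size_t)n;
    return 0;
}
```

A caller keeps a size_t length initialised to 0 and stops building the string as soon as append_fmt returns -1.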

answered Oct 25 '22 07:10

Chris Dodd
