What precisely does the %g printf specifier mean?

The %g specifier doesn't seem to behave in the way that most sources document it as behaving.

According to most sources I've found, across multiple languages that use printf specifiers, the %g specifier is supposed to be equivalent to either %f or %e - whichever would produce shorter output for the provided value. For instance, at the time of writing this question, cplusplus.com says that the g specifier means:

Use the shortest representation: %e or %f

And the PHP manual says it means:

g - shorter of %e and %f.

And here's a Stack Overflow answer that claims that:

%g uses the shortest representation.

And a Quora answer that claims that:

%g prints the number in the shortest of these two representations

But this behaviour isn't what I see in reality. If I compile and run this program (as C or C++ - it's a valid program with the same behaviour in both):

#include <stdio.h>

int main(void) {
    double x = 123456.0;
    printf("%e\n", x);
    printf("%f\n", x);
    printf("%g\n", x);
    printf("\n");

    double y = 1234567.0;
    printf("%e\n", y);
    printf("%f\n", y);
    printf("%g\n", y);
    return 0;
}

... then I see this output:

1.234560e+05
123456.000000
123456

1.234567e+06
1234567.000000
1.23457e+06

Clearly, the %g output doesn't quite match either the %e or %f output for either x or y above. What's more, it doesn't look like %g is minimising the output length either; y could've been formatted more succinctly if, like x, it had not been printed in scientific notation.

Are all of the sources I've quoted above lying to me?

I see identical or similar behaviour in other languages that support these format specifiers, perhaps because under the hood they call out to the printf family of C functions. For instance, I see this output in Python:

>>> print('%g' % 123456.0)
123456
>>> print('%g' % 1234567.0)
1.23457e+06

In PHP:

php > printf('%g', 123456.0);
123456
php > printf('%g', 1234567.0);
1.23457e+6

In Ruby:

irb(main):024:0* printf("%g\n", 123456.0)
123456
=> nil
irb(main):025:0> printf("%g\n", 1234567.0)
1.23457e+06
=> nil

What's the logic that governs this output?

asked Jan 12 '19 17:01

Mark Amery

1 Answer

This is the full description of the g/G specifier in the C11 standard:

A double argument representing a floating-point number is converted in style f or e (or in style F or E in the case of a G conversion specifier), depending on the value converted and the precision. Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X:

if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1).
otherwise, the conversion is with style e (or E) and precision P − 1.

Finally, unless the # flag is used, any trailing zeros are removed from the fractional portion of the result and the decimal-point character is removed if there is no fractional portion remaining.

A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.

This behaviour is somewhat similar to simply using the shortest representation out of %f and %e, but not equivalent. There are two important differences:

1. Trailing zeros (and, potentially, the decimal point) get stripped when using %g, which can cause the output of a %g specifier to not exactly match what either %f or %e would've produced.
2. The decision about whether to use %f-style or %e-style formatting is made based purely upon the size of the exponent that would be needed in %e-style notation, and does not directly depend on which representation would be shorter. There are several scenarios in which this rule results in %g selecting the longer representation, like the one shown in the question where %g uses scientific notation even though this makes the output 4 characters longer than it needs to be.

In case the C standard's wording is hard to parse, the Python documentation provides another description of the same behaviour:

General format. For a given precision p >= 1, this rounds the number to p significant digits and then formats the result in either fixed-point format or in scientific notation, depending on its magnitude.
The precise rules are as follows: suppose that the result formatted with presentation type 'e' and precision p-1 would have exponent exp. Then if -4 <= exp < p, the number is formatted with presentation type 'f' and precision p-1-exp. Otherwise, the number is formatted with presentation type 'e' and precision p-1. In both cases insignificant trailing zeros are removed from the significand, and the decimal point is also removed if there are no remaining digits following it.

Positive and negative infinity, positive and negative zero, and nans, are formatted as inf, -inf, 0, -0 and nan respectively, regardless of the precision.
A precision of 0 is treated as equivalent to a precision of 1. The default precision is 6.

The many sources on the internet that claim that %g just picks the shortest out of %e and %f are simply wrong.

answered Oct 19 '22 01:10

Mark Amery

Tags: c, language-agnostic, floating-point, printf, format-specifiers
