
Type of integer literals not int by default?

I just answered this question, which asked why iterating until 10 billion in a for loop takes so much longer (the OP actually aborted it after 10 mins) than iterating until 1 billion:

for (i = 0; i < 10000000000; i++) 

Now my and many others' obvious answer was that the iteration variable is 32-bit (so it never reaches 10 billion) and the loop therefore becomes an infinite loop.

But though I realized this problem, I still wonder what was really going on inside the compiler?

Since the literal has no L suffix, it should IMHO be of type int as well, and therefore 32-bit. Due to overflow it would then wrap to an ordinary int value inside the reachable range. To actually recognize that it cannot be reached by an int, the compiler needs to know that it is 10 billion and therefore treat it as a constant wider than 32 bits.

Does such a literal automatically get promoted to a fitting (or at least implementation-defined) range (at least 64-bit, in this case), even without an L suffix, and is this standard behaviour? Or is something different going on behind the scenes, like UB due to overflow (is integer overflow actually UB)? Some quotes from the Standard would be nice, if any.

Although the original question was C, I also appreciate C++ answers, if any different.

asked Nov 13 '11 00:11 by Christian Rau




2 Answers

As far as C++ is concerned:

C++11, [lex.icon] ¶2

The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented.

And Table 6, for literals without suffixes and decimal constants, gives:

int
long int
long long int

(Interestingly, for hexadecimal or octal constants unsigned types are allowed as well, but each one comes after the corresponding signed one in the list.)

So it's clear that in this case the constant has been interpreted as a long int (or as a long long int, if long int was also 32 bits).

Notice that "too big literals" should result in a compilation error:

A program is ill-formed if one of its translation units contains an integer literal that cannot be represented by any of the allowed types.

(ibidem, ¶3)

which is promptly seen in this sample, which also reminds us that ideone.com uses 32-bit compilers.


I saw now that the question was about C... well, it's more or less the same:

C99, §6.4.4.1

The type of an integer constant is the first of the corresponding list in which its value can be represented.

The list is the same as in the C++ standard.


Addendum: both C99 and C++11 also allow literals to have "extended integer types" (i.e. other implementation-specific integer types) if everything else fails. (C++11, [lex.icon] ¶3; C99, §6.4.4.1 ¶5 after the table)

answered Sep 22 '22 11:09 by Matteo Italia


From my draft of the C standard labeled ISO/IEC 9899:TC2 Committee Draft — May 6, 2005, the rules are remarkably similar to the C++ rules Matteo found:

5 The type of an integer constant is the first of the corresponding list in which its value can be represented.

Suffix        Decimal Constant          Octal or Hexadecimal Constant
-----------------------------------------------------------------------
none          int                       int
              long int                  unsigned int
              long long int             long int
                                        unsigned long int
                                        long long int
                                        unsigned long long int

u or U        unsigned int              unsigned int
              unsigned long int         unsigned long int
              unsigned long long int    unsigned long long int

l or L        long int                  long int
              long long int             unsigned long int
                                        long long int
                                        unsigned long long int

Both u or U   unsigned long int         unsigned long int
and l or L    unsigned long long int    unsigned long long int

ll or LL      long long int             long long int
                                        unsigned long long int

Both u or U   unsigned long long int    unsigned long long int
and ll or LL
answered Sep 26 '22 11:09 by sarnold