I do not understand why the sizeof operator is producing the following results:

sizeof( 2500000000 )  // => 8 (8 bytes).

... it returns 8, and when I do the following:

sizeof( 1250000000 * 2 )  // => 4 (4 bytes).

... it returns 4, rather than 8 (which is what I expected). Can someone clarify how sizeof determines the size of an expression (or data type), and why in my specific case this is occurring?

My best guess is that the sizeof operator is a compile-time operator.

Bounty Question: Is there a run-time operator that can evaluate these expressions and produce my expected output (without casting)?
sizeof yields the size of its operand. It can be applied to any data type or expression: integer types, floating-point types, pointer types, and so on. When sizeof is applied to a data type, it simply returns the amount of memory allocated to that data type, always measured in bytes. As Wikipedia puts it: in the programming languages C and C++, the unary operator sizeof is used to calculate the size of any datatype, measured in the number of bytes required to represent the type.
2500000000 doesn't fit in an int, so the compiler correctly interprets it as a long (or long long, or some other type where it fits). 1250000000 does fit, and so does 2. The operand of sizeof isn't evaluated; only its type matters, and the type of int * int is int. The compiler doesn't promote the multiplication to a larger type just because the mathematical result wouldn't fit, and so sizeof returns the size of an int.

Also, even if the operand were evaluated, the multiplication would overflow (undefined behavior), but the result's type would still be int, so you'd most likely still get 4.
Here:
#include <iostream>

int main() {
    long long x = 1250000000 * 2;
    std::cout << x;
}
can you guess the output? If you think it's 2500000000, you'd be wrong. The type of the expression 1250000000 * 2 is int, because both operands are int, and multiplication isn't automagically promoted to a larger data type just because the result doesn't fit.

http://ideone.com/4Adf97

So here, gcc prints -1794967296 (which is 2500000000 wrapped around modulo 2^32), but since signed overflow is undefined behavior, it could have been any number. That value does fit into an int.
In addition, if you cast one of the operands to the expected type (much like you cast integers when dividing if you're looking for a non-integer result), you'll see this working:
#include <iostream>

int main() {
    long long x = (long long)1250000000 * 2;
    std::cout << x;
}
yields the correct 2500000000.
[Edit: I did not notice, initially, that this was posted as both C and C++. I'm answering only with respect to C.]
Answering your follow-up question, "Is there any way to determine the amount of memory allocated to an expression or variable at run time?": well, not exactly. The problem is that this is not a very well-formed question.
"Expressions", in C-the-language (as opposed to some specific implementation), don't actually use any memory. (Specific implementations need some code and/or data memory to hold calculations, depending on how many results will fit into CPU registers and so on.) If an expression result is not stashed away in a variable, it simply vanishes (and the compiler can often omit the run-time code to calculate the never-saved result). The language doesn't give you a way to ask about something it doesn't assume exists, i.e., storage space for expressions.
Variables, on the other hand, do occupy storage (memory). The declaration for a variable tells the compiler how much storage to set aside. Except for C99's Variable Length Arrays, though, the storage required is determined purely at compile time, not at run time. This is why sizeof x is generally a constant-expression: the compiler can (and in fact must) determine the value of sizeof x at compile time.
C99's VLAs are a special exception to the rule:
void f(int n) {
    char buf[n];
    ...
}
The storage required for buf is not (in general) something the compiler can find at compile time, so sizeof buf is not a compile-time constant. In this case, buf actually is allocated at run time and its size is only determined then. So sizeof buf is a runtime-computed expression.
For most cases, though, everything is sized up front, at compile time, and if an expression overflows at run time, the behavior is undefined, implementation-defined, or well-defined depending on the type. Signed integer overflow, as in 1.25 billion multiplied by 2 when INT_MAX is just a little over 2.1 billion, results in undefined behavior. Unsigned integers do modular arithmetic, and thus allow you to calculate modulo 2^k.
If you want to make sure some calculation cannot overflow, that's something you have to calculate yourself, at run time. This is a big part of what makes multiprecision libraries (like gmp) hard to write in C—it's usually a lot easier, as well as faster, to code big parts of that in assembly and take advantage of known properties of the CPU (like overflow flags, or double-wide result-register-pairs).