It is very common, even in scripts where the developer has guarantees that a variable will never exceed one byte (or sometimes two), that people decide to use int for every variable that represents a number, even one in the range 0-1. Why does it hurt so much to use char or short instead?
I think I heard someone say that int is the "more standard" type. What does that mean? My question is: does the data type int have any defined advantages over short (or other smaller types), because of which people almost always resort to int?
As a general rule, most arithmetic in C is performed using type int (that is, plain int, not short or long). This is because (a) the definition of C says so, which is related to the fact that (b) that's the way many processors (at least, the ones C's designers had in mind) prefer to work.
So if you try to "save space" by using short ints instead, and you write something like
short a = 1, b = 2;
short c = a + b;
the compiler has to emit code to, in effect, convert a from short to int, convert b from short to int, do the addition, and convert the sum back to short. You may have saved a little bit of space on the storage for a, b, and c, but your code is likely to be bigger (and slower).
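In other words, the short version behaves roughly as if the conversions were written out by hand (a sketch of the implied promotions, not literal compiler output):
short c = (short)((int)a + (int)b);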
If you instead write
int a = 1, b = 2;
int c = a + b;
you spend a little more storage space for a, b, and c, but the code is probably smaller and quicker.
This is somewhat of an oversimplified argument, but it's behind your observation that usage of type short is rare, and plain int is generally recommended. Basically, since it's the machine's "natural" size, it's presumed to be the most straightforward type to do arithmetic in, without extra conversions to and from less-natural types. It's sort of a "When in Rome, do as the Romans do" argument, but it generally does make using plain int advantageous.
If you have lots of not-so-large integers to store, on the other hand (a large array of them, or a large array of structures containing not-so-large integers), the storage savings for the data might be large, and worth it as traded off against the (relatively smaller) increase in the code size, and the potential speed increase.
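For example (a made-up record type, just to illustrate the storage trade-off; the names are hypothetical):
#include <stdio.h>

struct rec_int   { int   x, y; };   /* typically 8 bytes per record */
struct rec_short { short x, y; };   /* typically 4 bytes per record */

int main(void)
{
    /* with a million records, the per-field savings add up */
    printf("int records:   %zu bytes\n", 1000000 * sizeof(struct rec_int));
    printf("short records: %zu bytes\n", 1000000 * sizeof(struct rec_short));
    return 0;
}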
See also this previous SO question and this C FAQ list entry.
Addendum: like any optimization problem, if you really care about data space usage, code space usage, and code speed, you'll want to perform careful measurements using your exact machine and processor. Your processor might not end up requiring any "extra conversion instructions" to convert to/from the smaller types, after all, so using them might not be so much of a disadvantage. But at the same time you can probably confirm that, for isolated variables, using them might not yield any measurable advantage, either.
Addendum 2. Here's a data point. I experimented with the code
extern short a, b, c;
void f()
{
    c = a + b;
}
I compiled with two compilers, gcc and clang (compiling for an Intel processor on a Mac). I then changed short to int and compiled again. The int-using code was 7 bytes smaller under gcc, and 10 bytes smaller under clang. Inspection of the assembly language output suggests that the difference was in truncating the result so as to store it in c; fetching short as opposed to int doesn't seem to change the instruction count.
However, I then tried calling the two different versions, and discovered that it made virtually no difference in the run time, even after 10000000000 calls. So the "using short might make the code bigger" part of the answer is confirmed, but maybe not "and also make it slower".
I was skeptical about the claim that short-based code should be slower and bigger in any significant way (assuming local variables here; no disputes about large arrays, where shorts definitely do pay off if appropriate), so I tried to benchmark it on my Intel(R) Core(TM) i5 CPU M 430 @ 2.27GHz.
I used (long.c):
long long_f(long A, long B)
{
    //made up func w/ a couple of integer ops
    //to offset func-call overhead
    long r=0;
    for(long i=0;i<10;i++){
        A=3*A*A;
        B=4*B*B*B;
        r=A+B;
    }
    return r;
}
in a long, int, and short-based version (%s/long/TYPE/g), built the program with gcc and clang in -O3 and -Os, and measured sizes and runtimes for 100 million invocations of each of these functions.
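For reference, here is what the int-based version (int.c) looks like after that substitution; short.c is the same with short:
int int_f(int A, int B)
{
    //made up func w/ a couple of integer ops
    //to offset func-call overhead
    int r=0;
    for(int i=0;i<10;i++){
        A=3*A*A;
        B=4*B*B*B;
        r=A+B;
    }
    return r;
}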
f.h:
#pragma once
int int_f(int A, int B);
short short_f(short A, short B);
long long_f(long A, long B);
main.c:
#include "f.h"
#include <stdlib.h>
#include <stdio.h>
#define CNT 100000000
int main(int C, char **V)
{
    int choose = atoi(V[1]?:"0");
    switch(choose){
    case 0:
        puts("short");
        for(int i=0; i<CNT;i++)
            short_f(1,2);
        break;
    case 1:
        puts("int");
        for(int i=0; i<CNT;i++)
            int_f(1,2);
        break;
    default:
        puts("long");
        for(int i=0; i<CNT;i++)
            long_f(1,2);
    }
}
build:
#!/bin/sh -eu
time(){ command time -o /dev/stdout "$@"; }
for cc in gcc clang; do
    $cc -Os short.c -c
    $cc -Os int.c -c
    $cc -Os long.c -c
    size short.o int.o long.o
    $cc main.c short.o int.o long.o
    echo $cc -Os
    time ./a.out 2
    time ./a.out 1
    time ./a.out 0
    $cc -O3 short.c -c
    $cc -O3 int.c -c
    $cc -O3 long.c -c
    size short.o int.o long.o
    $cc main.c short.o int.o long.o
    echo $cc -O3
    time ./a.out 2
    time ./a.out 1
    time ./a.out 0
done
I did it twice, and the results appear to be stable.
text data bss dec hex filename
79 0 0 79 4f short.o
80 0 0 80 50 int.o
87 0 0 87 57 long.o
gcc -Os
long
3.85user 0.00system 0:03.85elapsed 99%CPU (0avgtext+0avgdata 1272maxresident)k
0inputs+0outputs (0major+73minor)pagefaults 0swaps
int
4.78user 0.00system 0:04.78elapsed 99%CPU (0avgtext+0avgdata 1220maxresident)k
0inputs+0outputs (0major+74minor)pagefaults 0swaps
short
3.36user 0.00system 0:03.36elapsed 99%CPU (0avgtext+0avgdata 1328maxresident)k
0inputs+0outputs (0major+74minor)pagefaults 0swaps
text data bss dec hex filename
137 0 0 137 89 short.o
109 0 0 109 6d int.o
292 0 0 292 124 long.o
gcc -O3
long
3.90user 0.00system 0:03.90elapsed 99%CPU (0avgtext+0avgdata 1220maxresident)k
0inputs+0outputs (0major+74minor)pagefaults 0swaps
int
1.22user 0.00system 0:01.22elapsed 99%CPU (0avgtext+0avgdata 1260maxresident)k
0inputs+0outputs (0major+73minor)pagefaults 0swaps
short
1.62user 0.00system 0:01.62elapsed 99%CPU (0avgtext+0avgdata 1228maxresident)k
0inputs+0outputs (0major+73minor)pagefaults 0swaps
text data bss dec hex filename
83 0 0 83 53 short.o
79 0 0 79 4f int.o
88 0 0 88 58 long.o
clang -Os
long
3.33user 0.00system 0:03.33elapsed 99%CPU (0avgtext+0avgdata 1316maxresident)k
0inputs+0outputs (0major+71minor)pagefaults 0swaps
int
3.02user 0.00system 0:03.03elapsed 99%CPU (0avgtext+0avgdata 1316maxresident)k
0inputs+0outputs (0major+71minor)pagefaults 0swaps
short
5.27user 0.00system 0:05.28elapsed 99%CPU (0avgtext+0avgdata 1236maxresident)k
0inputs+0outputs (0major+69minor)pagefaults 0swaps
text data bss dec hex filename
110 0 0 110 6e short.o
219 0 0 219 db int.o
279 0 0 279 117 long.o
clang -O3
long
3.57user 0.00system 0:03.57elapsed 99%CPU (0avgtext+0avgdata 1228maxresident)k
0inputs+0outputs (0major+69minor)pagefaults 0swaps
int
2.86user 0.00system 0:02.87elapsed 99%CPU (0avgtext+0avgdata 1228maxresident)k
0inputs+0outputs (0major+68minor)pagefaults 0swaps
short
1.38user 0.00system 0:01.38elapsed 99%CPU (0avgtext+0avgdata 1204maxresident)k
0inputs+0outputs (0major+70minor)pagefaults 0swaps
The results are fairly close to each other, and yet they vary quite widely with different compilers and compiler settings.
My conclusion is that choosing between int and short in a function body or signature (arrays are a different issue) because one should perform better than the other or generate denser code is mostly futile (at least in code that isn't tied to a specific compiler with specific settings). Either is fast, so I'd choose whichever type fits the semantics of my program better or communicates my API better. (If I'm expecting a short positive value, I might as well use a uchar or ushort in the signature.)
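For instance (a hypothetical signature, using the fixed-width <stdint.h> names):
#include <stdint.h>

/* hypothetical API: the parameter is known to fit in 0..100, so the
   narrow unsigned type documents that fact in the signature; inside
   the function the value is promoted to int for arithmetic anyway */
void set_volume_percent(uint8_t percent);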
C programmers are predisposed to use ints because C has favored them historically (integer literals tend to be ints, promotions tend to make ints, there used to be implicit-int rules for declarations and undeclared functions, etc.), and ints are supposed to be a good fit for the architecture. But at the end of the day, dense, performant machine code with a readable, maintainable source is what matters, and if your theory for doing something in the source code doesn't demonstrably contribute towards at least one of these goals, I think it's a bad theory.
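(As a small illustration of the "promotions tend to make ints" point, here is a sketch; the exact sizes depend on the platform:)
#include <stdio.h>

int main(void)
{
    short a = 1, b = 2;
    /* a and b undergo integer promotion, so a + b has type int;
       on a typical platform this prints sizeof(short)=2, sizeof(a+b)=4 */
    printf("sizeof(short)=%zu, sizeof(a+b)=%zu\n", sizeof(short), sizeof(a + b));
    return 0;
}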