I'm working on a C++/Qt project for Embedded Linux where we are constantly "duelling" with the limitations of our processor, especially when it comes to updating the graphs in the user interface. Because of those limitations (and especially because of our situation some time ago, when things were even worse), I try to optimize the code whenever I can and whenever the cost of the optimization is minimal. One such optimization is always using the smallest integer type that fits the situation at hand: qint8, qint16 or qint32, depending on how big the value can get.
But some time ago I read somewhere that instead of trying to use the smallest integer size possible, I should always prefer the integer type that matches the word size of my processor; that is, if my processor is 32-bit, then I should prefer qint32 even where such a big integer isn't required. At first I couldn't understand why, but the answer to this question suggests that it is because the processor performs better when working with its "default integer size".
Well, I'm not convinced. First of all, no actual reference was provided to confirm that thesis: I just can't understand why reading and writing a small integer in a 32-bit memory space would be slower than doing it with a 32-bit integer (and the explanation given wasn't very comprehensible, by the way). Second, there are moments in my app when I need to transfer data from one side to the other, such as when using Qt's signals and slots mechanism. Since I'm transferring data from one point to the other, shouldn't smaller data always be an improvement over bigger data? I mean, a signal sending two chars (not by reference) is supposed to do its work quicker than one sending two 32-bit integers, isn't it?
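To give an idea, here is a simplified sketch of the kind of signal I mean (the class and signal names are made up):

#include <QObject>

class Sensor : public QObject
{
    Q_OBJECT
signals:
    void sampleReady(char channel, char value);        // two chars by value...
    void sampleReady32(qint32 channel, qint32 value);  // ...versus two 32-bit ints
};

My intuition says emitting the first one should be cheaper, since less data has to be copied around.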
In fact, while the "processor explanation" suggests going with the characteristics of your processor, other cases suggest the opposite. For example, when dealing with databases, this thread and this one both suggest that there is an advantage (even if only in some cases) in using the smaller integer types.
So, after all, should I prefer the small integer types when the context allows it or not? Or is there a list of cases where one approach or the other is more likely to give better or worse results? (E.g. should I use int8 and int16 when dealing with databases, but my processor's default type in all other situations?)
And as a last question: Qt normally has int-based implementations of its functions. In such cases, doesn't the cast operation annihilate any possible improvement one could get by using smaller integers?
This question is really too broad to answer without specifying a particular CPU: some 32-bit CPUs have plenty of instructions for handling smaller types, some don't; some 32-bit CPUs handle misaligned access smoothly, some produce slower code because of it, and some halt and catch fire when they encounter it.
That being said, first of all there is the case of standard integer promotion, present in every C and C++ program, which will implicitly convert all small integer types you use into int.
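For example (a minimal illustration):

#include <stdint.h>

uint8_t a = 200;
uint8_t b = 100;
int sum = a + b;  /* a and b are promoted to int before the addition,
                     so sum is 300; the arithmetic never actually
                     happens in 8 bits */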
The compiler is free to apply integer promotion as specified in the standard, or to optimize it away, whichever leads to the most efficient code, as long as the results are the same as for non-optimized code.
Implicit promotion may produce more efficient code, but it may also create subtle, disastrous bugs with unexpected changes of type and signedness, if the programmer is not aware of how the various implicit type promotion rules work. Sadly, plenty of would-be C and C++ programmers are not. When using the smaller integer types, you need to be a much more competent/awake programmer than if you just use 32-bit variables all over the place.
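A classic example of the kind of bug meant here (the variable name is arbitrary):

#include <stdint.h>

uint8_t c = 0;
if (c - 1 > 0) {
    /* never taken: c is promoted to (signed) int, so c - 1 evaluates
       to -1. Anyone expecting unsigned wrap-around to 255 gets bitten
       by the implicit promotion. */
}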
So if you are reading this but have never heard of the integer promotion rules or the usual arithmetic conversions/balancing, then I would strongly suggest that you immediately stop any attempt at manually optimizing integer sizes and go read up on those implicit promotion rules instead.
If you are aware of all the implicit promotion rules, then you can do manual optimization by using smaller integer types. But use the ones which give the compiler the most flexibility. Those are:
#include <stdint.h>
int_fast8_t
int_fast16_t
uint_fast8_t
uint_fast16_t
When these types are used, the compiler is free to replace them with a larger type if that would yield faster code.
The difference between merely relying on integer promotion/expression optimization and using the above types is that with the fast types the compiler can not only decide which type suits the CPU registers best for a given calculation, but also decide on memory consumption and alignment when the variables are allocated.
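For illustration, here is a quick way to see what the compiler picked (the printed sizes are implementation-defined and vary between targets):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* the fast types may be wider than 8/16 bits if that is faster
       on the target CPU */
    printf("int_fast8_t : %zu byte(s)\n", sizeof(int_fast8_t));
    printf("int_fast16_t: %zu byte(s)\n", sizeof(int_fast16_t));
    return 0;
}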
One strong argument against small variables is that, when mapped to registers (assuming they're not expanded implicitly), they may cause unintended false dependencies if your ISA uses partial registers. This is the case on x86, where some old programs still use AH or AX and their counterparts as 8/16-bit registers. If the register has a stale value stuck in its upper part (due to a previous write to the full register), the CPU may be forced to carry it along and merge it with any partial value you calculate in order to maintain correctness, creating serial dependency chains even where your calculations were independent.
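As a sketch of that pattern (hypothetical code; whether it actually bites depends entirely on the instructions your compiler emits, and modern compilers usually break the dependency with a zero-extending move):

#include <stdint.h>
#include <stddef.h>

uint16_t sum16(const uint16_t *data, size_t n)
{
    uint16_t sum = 0;       /* on x86 this may be kept in AX */
    for (size_t i = 0; i < n; ++i)
        sum += data[i];     /* a 16-bit add writes only the low half of
                               the register, so it can inherit a bogus
                               dependency on whatever the upper half
                               held before */
    return sum;
}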
The memory claim raised in the answer you linked also holds, although I find it a bit weaker: it's true that memory subsystems usually work at full cache-line granularity (often 64 bytes these days) and then rotate and mask, but that alone should not cause a performance impact; if anything it improves performance when your data access patterns exhibit spatial locality. Smaller variables may in some cases also increase the risk of alignment issues, especially if you pack variables of different sizes closely together, but most compilers know better than that (unless explicitly forced not to).
I think the main problem with small variables in memory would be, again, an increased chance of false dependencies: the merging is done implicitly by the memory system, but if other cores (or sockets) share some of your variables, you run the risk of knocking the entire line out of the cache.
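To illustrate the sharing hazard (a sketch assuming 64-byte cache lines; real code would also want std::atomic for the concurrent updates):

#include <stdint.h>

struct counters_packed {
    uint8_t a;               /* updated by core 0 */
    uint8_t b;               /* updated by core 1: shares a cache line
                                with 'a', so each write evicts the line
                                from the other core's cache */
};

struct counters_padded {
    alignas(64) uint8_t a;   /* each counter gets a line of its own */
    alignas(64) uint8_t b;
};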
In general, there is little use in optimizing too early. For local variables and smaller classes and structs, there is little to no gain in using non-native types. Depending on the procedure call standard, packing/unpacking smaller types into a single register might even add more code than the word-size types would cost.
For larger arrays and list/tree nodes (in other words: larger data structures), however, things can be different. Here it might be worth using appropriately sized types rather than the native ones, using C-style structs without methods, etc. For most "modern" (since the end of the last century) Linux-compatible architectures, there is mostly no penalty for the smaller integer types. For floating-point types, there might be architectures that only support float, not double, in hardware, or where double takes longer to process. On those, using the smaller type does not just reduce the memory footprint, it is also faster.
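For example (a sketch; the field names and value ranges are assumptions):

#include <stdint.h>

struct node_native {   /* typically 8 bytes */
    int value;         /* suppose the value always fits in 16 bits */
    int flags;         /* and only a handful of flags exist */
};

struct node_packed {   /* typically 4 bytes, i.e. half the footprint */
    int16_t value;
    uint8_t flags;     /* across millions of nodes this halves memory
                          traffic and doubles cache density */
};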
Instead of shrinking the types of members/variables, it can be worth optimizing the class hierarchy, or even using native C code (or C-style coding) for some parts. Things like virtual methods and RTTI can be pretty costly: the former uses large jump tables, the latter adds descriptors to each class.
Note that some of these statements assume code and data reside in RAM, as is typical for (embedded) Linux systems. If code/constants are stored in flash, for example, you have to weigh the statements by their impact on the respective memory type.