Let's say that you were to have a structure similar to the following:
struct Person {
    int gender;         // between 0-1
    int age;            // between 0-200
    int birthmonth;     // between 0-11
    int birthday;       // between 1-31
    int birthdayofweek; // between 0-6
};
In terms of performance, which would be the best data type to store each of the fields? (e.g. bitfield, int, char, etc.)
It will be used on an x86 processor and stored entirely in RAM. A fairly large number will need to be stored (50,000+), so processor caches and such will need to be taken into account.
Edit: OK, let me rephrase the question. If memory usage is not important, and the entire dataset will not fit into the cache no matter which datatypes are used, is it generally better to use smaller datatypes to fit more of the data into the CPU cache, or is it better to use larger datatypes to allow the CPU to perform faster operations? I am asking this for reference only, so code readability and such should not be considered.
Ints are generally faster, since CPUs are most efficient when working with 32-bit values. Bytes are primarily used to reduce memory usage, e.g. when working with large files, streaming data, and so on.
A byte datatype has a range of -128 to 127 and requires very little memory (only 1 byte). It can be used in place of int wherever we are sure the values will fit in that small range.
So the instructions to access a bit are a superset of the instructions to access a byte. I would also add that it's safe to assume that byte access is faster than bit access simply because assuming the reverse can be detrimental if you're optimizing for speed. In general, worst case is that they're the same speed.
Thus, operations on char/short/int are generally equally fast, since the smaller types are promoted to int before arithmetic is performed.
In general, I would stick with ints... except for gender, which should probably be an enum.