
fixed length data types in C/C++

Tags:

c++

c



I know people solve this issue with typedefs, defining types such as u8, u16, and u32 that are guaranteed to be 8, 16, and 32 bits wide regardless of the platform.

Some platforms have no type of a certain size (for example, TI's 28xxx, where a char is 16 bits wide). In such cases it is not possible to have a native 8-bit type. You could emulate one if you really wanted it, but that may introduce a performance hit.

How is this usually achieved?

Usually with typedefs. C99 and C++11 provide these typedefs in a standard header (<stdint.h> in C, <cstdint> in C++), so just use them.

Can someone give an example of what goes wrong when a program assumes an int is 4 bytes, but on a different platform it is, say, 2 bytes?

The best example is communication between systems with different type sizes. When sending an array of ints from one platform to another where sizeof(int) differs, one has to take extreme care.

Another example is saving an array of ints to a binary file on a 32-bit platform and reinterpreting it on a 64-bit platform.
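To make the mismatch concrete, here is a minimal sketch of how the same block of bytes holds a different number of integers depending on the width the reader assumes. The helper count_elements and the constant kPayloadBytes are made up for illustration, and int16_t / int32_t stand in for "a 2-byte int" and "a 4-byte int".

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many values of type T fit in `bytes` bytes.
template <typename T>
constexpr std::size_t count_elements(std::size_t bytes) {
    return bytes / sizeof(T);
}

// A sender whose int is 16 bits wide writes four values.
constexpr std::size_t kPayloadBytes = 4 * sizeof(std::int16_t);

// The sender sees four integers in the payload...
static_assert(count_elements<std::int16_t>(kPayloadBytes) == 4, "sender's view");

// ...but a receiver that assumes 32-bit ints sees only two, each one
// built from the bytes of two unrelated values.
static_assert(count_elements<std::int32_t>(kPayloadBytes) == 2, "receiver's view");
```

Neither side gets an error here. The receiver just reads garbage, which is why the width has to be agreed on explicitly.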


In earlier iterations of the C standard, you generally wrote your own typedef statements to ensure you got (for example) a 16-bit type, selected by #define macros passed to the compiler, for example:

gcc -DINT16_IS_LONG ...

Nowadays (C99 and above), there are specific types such as uint16_t, the exactly 16-bit wide unsigned integer.

Provided you include stdint.h, you get exact-bit-width types, at-least-that-width types, fastest types with a given minimum width, and so on, as documented in C99 7.18 Integer types <stdint.h>. If an implementation has compatible types, it is required to provide the exact-width ones.
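As a rough illustration of the three families, the sketch below checks the number of value bits of one type from each, using the C++ spelling of the header (<cstdint>). The helper value_bits is a made-up name, not part of the standard.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: number of value bits in an unsigned integer type T.
template <typename T>
constexpr int value_bits() {
    return std::numeric_limits<T>::digits;
}

// Exact-width type: exactly 16 bits (only present if the platform has such a type).
static_assert(value_bits<std::uint16_t>() == 16, "exactly 16 bits");

// At-least-width type: the smallest type with 16 bits or more.
static_assert(value_bits<std::uint_least16_t>() >= 16, "at least 16 bits");

// Fastest type with 16 bits or more; it may well be wider than 16.
static_assert(value_bits<std::uint_fast16_t>() >= 16, "at least 16 bits, chosen for speed");
```

If the exact-width type is missing on some platform, the first assertion fails to compile, while the least and fast variants keep working.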

Also very useful is inttypes.h which adds some other neat features for format conversion of these new types (printf and scanf format strings).
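For instance, the PRIu32 macro expands to the right printf conversion for uint32_t on whatever platform you compile for. A minimal sketch, using the C++ header <cinttypes> and a made-up helper name format_u32:

```cpp
#include <cinttypes>
#include <cstdio>
#include <string>

// Hypothetical helper: format a 32-bit unsigned value without guessing
// whether uint32_t is unsigned int or unsigned long on this platform.
std::string format_u32(std::uint32_t value) {
    char buf[16];  // 10 digits at most, plus the terminating NUL
    std::snprintf(buf, sizeof buf, "%" PRIu32, value);
    return std::string(buf);
}
```

Writing "%u" or "%lu" by hand would be correct on some platforms and wrong on others. The macro avoids that choice.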


For the first question: Integer Overflow.

For the second question: for example, to typedef an unsigned 32-bit integer on a platform where int is 4 bytes, use:

typedef unsigned int u32;

On a platform where int is 2 bytes while long is 4 bytes:

typedef unsigned long u32;

In this way, you only need to modify one header file to make the types cross-platform.

If there are platform-specific macros, this can be achieved without modifying the header manually:

#if defined(PLAT1)
typedef unsigned int u32;
#elif defined(PLAT2)
typedef unsigned long u32;
#endif

If C99 stdint.h is supported, it's preferred.
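When stdint.h is not available, one possible variation on the idea above (an alternative, not what the snippet above does) is to select the typedef from the limits in <climits> instead of platform macros, so the header does not need to know platform names. A sketch, keeping the u32 name from above:

```cpp
#include <climits>

// Pick the first candidate whose maximum value is exactly 2^32 - 1.
#if UINT_MAX == 0xFFFFFFFFu
typedef unsigned int u32;
#elif ULONG_MAX == 0xFFFFFFFFu
typedef unsigned long u32;
#else
#error "Neither unsigned int nor unsigned long is 32 bits wide"
#endif

// Double-check the storage size as well (assumes C++11 for static_assert).
static_assert(sizeof(u32) * CHAR_BIT == 32, "u32 must occupy exactly 32 bits");
```

The #error branch turns an unsupported platform into a build failure rather than a silently wrong type.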


First of all: never write programs that rely on the width of types like short, int, unsigned int, and so on.

Basically: "never rely on the width, if it isn't guaranteed by the standard".

If you want to be truly platform independent and store, e.g., the value 33000 as a signed integer, you can't just assume that an int will hold it. An int is only guaranteed to cover the range -32767 to 32767 (or -32768 to 32767, depending on ones'/two's complement). That's not enough, even though int is usually 32 bits wide and therefore capable of storing 33000. For this value you definitely need a type wider than 16 bits, so you simply choose int32_t or int64_t. If that type doesn't exist, the compiler will report an error, so it won't be a silent mistake.

Second: C++11 provides a standard header for fixed-width integer types. None of these are guaranteed to exist on your platform, but when they exist, they are guaranteed to be of the exact width. See the article on cppreference.com for a reference. The types are named in the format int[n]_t and uint[n]_t, where n is 8, 16, 32 or 64. You'll need to include the header <cstdint>. The C header is of course <stdint.h>.
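The range argument can be checked at compile time. The following is only a sketch; fits is a hypothetical helper, and int16_t stands in for an int that has just the minimum guaranteed range.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when `value` is representable in type T.
template <typename T>
constexpr bool fits(long long value) {
    return value >= std::numeric_limits<T>::min() &&
           value <= std::numeric_limits<T>::max();
}

// 33000 is outside the range of a 16-bit signed integer...
static_assert(!fits<std::int16_t>(33000), "33000 does not fit in 16 bits");

// ...but comfortably inside the range of a 32-bit one.
static_assert(fits<std::int32_t>(33000), "33000 fits in 32 bits");
```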


Usually, the issue happens when you max out the number or when you're serializing. A less common scenario happens when someone makes an explicit size assumption.

In the first scenario:

int x = 32000;
int y = 32000;
int z = x+y;        // can cause overflow for 2 bytes, but not 4
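A sketch of the same sum done with an explicit 32-bit type, next to a simulation of what a 2-byte int commonly produces. Signed overflow is undefined behavior in C and C++, so the wraparound in wrap_to_16 only models what two's complement hardware typically does; it is not something the language promises. Both function names are made up.

```cpp
#include <cstdint>

// With an explicit 32-bit type the sum is well defined on every platform
// that provides int32_t.
constexpr std::int32_t add32(std::int32_t x, std::int32_t y) {
    return x + y;
}

// Hypothetical model of two's complement wraparound into 16 bits.
// This imitates common hardware behavior; real signed overflow is undefined.
constexpr std::int32_t wrap_to_16(std::int32_t v) {
    return static_cast<std::int32_t>(
               (static_cast<std::uint32_t>(v) + 32768u) % 65536u) - 32768;
}

static_assert(add32(32000, 32000) == 64000, "fits in 32 bits");
static_assert(wrap_to_16(64000) == -1536, "64000 - 65536 after wrapping");
```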

In the second scenario,

struct header {
int magic;
int w;
int h;
};

then one goes to fwrite:

header h;
// fill in h
fwrite(&h, sizeof(h), 1, fp);

// this is all fine and good until one freads from an architecture with a different int size
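One common way around this, sketched below under the assumption of 8-bit bytes, is to give the fields fixed widths and write them out byte by byte in an agreed order instead of dumping the struct. The names FixedHeader, put_u32_le, get_u32_le, encode_header and decode_header are invented for this example, and little-endian is an arbitrary choice.

```cpp
#include <array>
#include <cstdint>

struct FixedHeader {
    std::uint32_t magic;
    std::uint32_t w;
    std::uint32_t h;
};

// Store v as four bytes, least significant byte first.
inline void put_u32_le(unsigned char* out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFFu);
}

// Rebuild a value from four bytes, least significant byte first.
inline std::uint32_t get_u32_le(const unsigned char* in) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return v;
}

// The on-disk form is always 12 bytes, whatever the host's int size,
// byte order, or struct padding.
std::array<unsigned char, 12> encode_header(const FixedHeader& hdr) {
    std::array<unsigned char, 12> out{};
    put_u32_le(out.data() + 0, hdr.magic);
    put_u32_le(out.data() + 4, hdr.w);
    put_u32_le(out.data() + 8, hdr.h);
    return out;
}

FixedHeader decode_header(const std::array<unsigned char, 12>& in) {
    FixedHeader hdr;
    hdr.magic = get_u32_le(in.data() + 0);
    hdr.w = get_u32_le(in.data() + 4);
    hdr.h = get_u32_le(in.data() + 8);
    return hdr;
}
```

The 12-byte array is what you would pass to fwrite and get back from fread.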

In the third scenario:

int* x = new int[100];
char* buff = (char*)x;


// now try to change the 3rd element of x via buff assuming int size of 2
*((int*)(buff+2*2)) = 100;

// (of course, it's easy to fix this with sizeof(int))
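A sketch of that fix: compute the byte offset from sizeof(int) and copy the value in with memcpy, so nothing depends on a guessed width. The function name set_via_bytes is made up.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: overwrite element `index` of an int array through
// its raw byte view, without assuming how many bytes an int occupies.
void set_via_bytes(int* array, std::size_t index, int value) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(array);
    std::memcpy(bytes + index * sizeof(int), &value, sizeof(int));
}
```

Calling set_via_bytes(x, 2, 100) changes the 3rd element and leaves its neighbors alone, whether int is 2, 4, or 8 bytes.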

If you're using a relatively new compiler, I would use uint8_t, int8_t, etc. to be sure of the type size.

In older compilers, the typedefs are usually defined on a per-platform basis. For example, one may do:

#ifdef _WIN32
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
// and so on...
#endif

In this way, there would be a header per platform that defines specifics of that platform.


I am curious: how can one manually enforce that some type is always, say, 32 bits regardless of the platform?

If you want your (modern) C++ program's compilation to fail if a given type is not the width you expect, add a static_assert somewhere. I'd add this around where the assumptions about the type's width are being made.

static_assert(sizeof(int) == 4, "Expected int to be four chars wide but it was not.");

Note that sizeof counts chars, not bits. A char is 8 bits wide on most commonly used platforms, but not all platforms work this way.
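If you want the assertion to talk about bits rather than chars, you can multiply by CHAR_BIT from <climits>. A minimal sketch; bit_width_of is a made-up helper, and it reports the storage size in bits (including any padding bits).

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: storage size of T in bits.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}

// Holds on any platform that provides uint32_t at all.
static_assert(bit_width_of<std::uint32_t>() == 32, "uint32_t occupies 32 bits");

// By definition a char occupies CHAR_BIT bits, which need not be 8.
static_assert(bit_width_of<unsigned char>() == CHAR_BIT, "char is CHAR_BIT bits");
```

The same helper could be used as static_assert(bit_width_of<int>() == 32, "...") wherever the code really depends on a 32-bit int.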