
union versus void pointer

Tags: c, unions

What would be the differences between using simply a void* as opposed to a union? Example:

struct my_struct {
    short datatype;
    void *data;
};

struct my_struct {
    short datatype;
    union {
        char* c;
        int* i;
        long* l;
    };
};

Both can be used to accomplish exactly the same thing. Is it better to use the union or the void*?

user105033 asked Nov 30 '09 18:11


3 Answers

I had exactly this case in our library. We had a generic string mapping module that could use different index sizes: 8, 16 or 32 bits (for historical reasons). So the code was full of constructs like this:

if(map->idxSiz == 1)
   return ((BYTE *)map->idx)[Pos] = ...whatever
else if(map->idxSiz == 2)
   return ((WORD *)map->idx)[Pos] = ...whatever
else
   return ((LONG *)map->idx)[Pos] = ...whatever

There were 100 lines like that. As a first step, I changed it to a union and I found it to be more readable.

switch(map->idxSiz) {
  case 1: return map->idx.u8[Pos] = ...whatever
  case 2: return map->idx.u16[Pos] = ...whatever
  case 4: return map->idx.u32[Pos] = ...whatever
}

This allowed me to see more clearly what was going on. I could then decide to completely remove the idxSiz variants using only 32-bit indexes. But this was only possible once the code got more readable.
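As a rough illustration of the union-of-pointers layout described above, here is a minimal sketch. The names `index_map`, `idx_size`, `index_set` and `index_get` are hypothetical and not taken from the original library. It assumes the size field holds the element width in bytes (1, 2 or 4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the map structure; all names are illustrative. */
struct index_map {
    size_t idx_size;          /* element width in bytes: 1, 2 or 4 */
    union {
        uint8_t  *u8;
        uint16_t *u16;
        uint32_t *u32;
    } idx;                    /* one buffer pointer, viewed with the matching type */
};

/* Store value at position pos, truncating it to the element width. */
void index_set(struct index_map *map, size_t pos, uint32_t value)
{
    switch (map->idx_size) {
    case 1: map->idx.u8[pos]  = (uint8_t)value;  break;
    case 2: map->idx.u16[pos] = (uint16_t)value; break;
    case 4: map->idx.u32[pos] = value;           break;
    default: assert(!"unsupported index width"); break;
    }
}

/* Read the element at position pos, widened to 32 bits. */
uint32_t index_get(const struct index_map *map, size_t pos)
{
    switch (map->idx_size) {
    case 1: return map->idx.u8[pos];
    case 2: return map->idx.u16[pos];
    case 4: return map->idx.u32[pos];
    default: assert(!"unsupported index width"); return 0;
    }
}
```

The casts disappear from every access site. The only place that still has to agree with the size field is the initialization, which should set the union member matching `idx_size`.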

PS: That was only a minor part of our project, which consists of several hundred thousand lines of code written by people who are no longer around. Changes to the code have to be gradual so as not to break the applications.

Conclusion: Even if people are less used to the union variant, I prefer it because it can make the code much easier to read. On big projects, readability is extremely important, even if you are the only one who will read the code later.

Edit: Adding my comment here, as comments do not format code.

The change to a switch came first. This is the real code as it was:

switch(this->IdxSiz) { 
  case 2: ((uint16_t*)this->iSort)[Pos-1] = (uint16_t)this->header.nUz; break; 
  case 4: ((uint32_t*)this->iSort)[Pos-1] = this->header.nUz; break; 
}

was changed to

switch(this->IdxSiz) { 
  case 2: this->iSort.u16[Pos-1] = this->header.nUz; break; 
  case 4: this->iSort.u32[Pos-1] = this->header.nUz; break; 
}

I shouldn't have combined all the beautification I did in the code; I should have shown only that step. But I posted my answer from home, where I had no access to the code.

Patrick Schlüter answered Oct 18 '22 04:10


In my opinion, a void pointer with explicit casting is the better way, because the intent is obvious to every seasoned C programmer.

Edit to clarify: If I see the said union in a program, I would ask myself whether the author wanted to restrict the types of the stored data. Perhaps some sanity checks are performed that make sense only on integer types. But if I see a void pointer, I know immediately that the author designed the data structure to hold arbitrary data. Thus I can use it for newly introduced structure types, too. Note that I may not be able to change the original code, e.g. if it is part of a third-party library.
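To make that concrete, here is a minimal sketch, assuming the `my_struct` layout from the question. The tag values, `struct point`, `as_long` and `point_x` are hypothetical names introduced only for illustration. It shows a new structure type being stored without any change to `struct my_struct` itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; a real library would define its own. */
enum { TYPE_INT = 1, TYPE_LONG = 2, TYPE_USER = 100 };

struct my_struct {
    short datatype;
    void *data;               /* may point at an object of any type */
};

/* A type introduced later, without touching struct my_struct. */
struct point { int x, y; };

/* Return the payload as a long; valid only for integer-tagged values. */
long as_long(const struct my_struct *s)
{
    switch (s->datatype) {
    case TYPE_INT:  return *(const int *)s->data;
    case TYPE_LONG: return *(const long *)s->data;
    default: assert(!"not an integer payload"); return 0;
    }
}

/* Read the x field of a user-defined payload. */
int point_x(const struct my_struct *s)
{
    assert(s->datatype == TYPE_USER);
    return ((const struct point *)s->data)->x;
}
```

The explicit casts are where the reader sees which type is expected. The compiler cannot check that the tag and the cast agree, so that discipline stays with the programmer.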

swegi answered Oct 18 '22 05:10


It's more common to use a union to hold actual objects rather than pointers.

I think most C developers that I respect would not bother to union different pointers together; if a general-purpose pointer is needed, just using void * certainly is "the C way". The language sacrifices a lot of safety in order to allow you to deliberately alias the types of things; considering what we have paid for this feature we might as well use it when it simplifies the code. That's why the escapes from strict typing have always been there.
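For comparison, a union that holds the objects themselves rather than pointers might look like the sketch below. The names `value`, `value_kind`, `value_from_int` and `value_to_long` are hypothetical. The union member is given a name (`as`) so the code does not rely on anonymous unions, which are standard only since C11.

```c
#include <assert.h>

enum value_kind { KIND_CHAR, KIND_INT, KIND_LONG };

/* Tagged union: the tag records which member currently holds the value. */
struct value {
    enum value_kind kind;
    union {
        char c;
        int  i;
        long l;
    } as;
};

/* Build a value holding an int. */
struct value value_from_int(int i)
{
    struct value v;
    v.kind = KIND_INT;
    v.as.i = i;
    return v;
}

/* Widen whichever member is active to a long. */
long value_to_long(struct value v)
{
    switch (v.kind) {
    case KIND_CHAR: return v.as.c;
    case KIND_INT:  return v.as.i;
    case KIND_LONG: return v.as.l;
    }
    assert(!"unknown kind");
    return 0;
}
```

Here the data lives inside the struct, so there is no separate allocation to manage, and a `struct value` can be copied or returned by value.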

DigitalRoss answered Oct 18 '22 03:10