Looping over structure elements using pointers in C

I wrote this code to iterate over members of a structure. It works fine. Can I use similar method for structures with mixed type elements, i.e. some integers, some floats and ...?

#include <stdio.h>
#include <stdlib.h>

struct newData
{
    int x;
    int y;
    int z;
}  ;

int main()
{
    struct newData data1;
    data1.x = 10;
    data1.y = 20;
    data1.z = 30;

    struct newData *data2 = &data1;
    long int *addr = data2;
    for (int i=0; i<3; i++)
    {
        printf("%d \n", *(addr+i));
    }
}
Iman H asked Oct 17 '18 16:10



2 Answers

In C, "it works fine" is not good enough, because your compiler is allowed to do this:

struct newData
{
    int x;
    char padding1[523];
    int y;
    char padding2[364];
    int z;
    char padding3[251];
};

Of course, this is an extreme example. But you get the general idea; it's not guaranteed that your loop will work because it's not guaranteed that struct newData is equivalent to int[3].

So no, it's not possible in the general case because it's not always possible in the specific case!


Now, you might be thinking: "What idiots decided this?!" Well, I can't tell you that, but I can tell you why. Computers are very different to each other, and if you want code to run fast then the compiler has to be able to choose how to compile the code. Here's an example:

Processor 8 has an instruction to get individual bytes, and put them in a register:

GETBYTE addr, reg

This works well with this struct:

struct some_bytes {
   char age;
   char data;
   char stuff;
};

struct some_bytes can happily take up 3 bytes, and the code is fast. But what about Processor 16? It doesn't have GETBYTE, but it does have GETWORD:

GETWORD even_addr, reghl

This only accepts an even-numbered address, and reads two bytes; one into the "high" part of the register and one into the "low" part of the register. In order to make the code fast, the compiler has to do this:

struct some_bytes {
   char age;
   char pad1;
   char data;
   char pad2;
   char stuff;
   char pad3;
};

This means that the code can run faster, but it also means that your loop won't work. That's OK though, because it's something called "Undefined Behaviour"; the compiler is allowed to assume that it'll never happen, and if it does happen the behaviour is undefined.

In fact, you've already run across this behaviour! Your particular compiler may well have been doing this:

struct newData
{
    int x;
    int pad1;
    int y;
    int pad2;
    int z;
    int pad3;
};

Because your particular compiler defines long int as twice the length of int, you were able to do this:

|  x  | pad |  y  | pad |  z  | pad |
| long no.1 | long no.2 | long no.3 |
| int |     | int |     | int |     |

That code is, as you can tell by my precarious diagram, precarious. It probably won't work anywhere else. What's worse, your compiler, if it was being clever, would be able to do this:

for (int i=0; i<3; i++)
{
    printf("%d \n", *(addr+i));
}

Hmm... addr is from data2, which is a pointer to data1, which is a struct newData. The C specification says that only the pointer to the start of the struct will ever be dereferenced, so I can assume that i is always 0 in this loop!

for (int i=0; i<3 && i == 0; i++)
{
    printf("%d \n", *(addr+i));
}

That means it only runs once! Hooray!

printf("%d \n", *(addr + 0));

And all I need to compile is this:

int main()
{
    printf("%d \n", 10);
}

Wow, the programmer will be so pleased that I've managed to speed this code up so much!

You won't be pleased. In fact, you'll get unexpected behaviour, and won't be able to work out why. But you would be pleased if you had written code free of Undefined Behaviour, and your compiler had done something similar. So it stays.

wizzwizz4 answered Sep 28 '22 12:09


You're invoking undefined behavior. Just because it appears to work doesn't mean it's valid.

Pointer arithmetic is only valid when the original and resulting pointers both point into the same array object (or one past the end of the array object). You have multiple distinct objects (even though they're members of the same struct), so a pointer to one can't legally be used to get a pointer to the other.

This is detailed in section 6.5.6p8 of the C standard:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P) ) and (P)-N (where N has the value n ) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

dbush answered Sep 28 '22 11:09