I recently noticed that in C, there is an important difference between array and &array for the following declaration:
char array[] = {4, 8, 15, 16, 23, 42};
In most expressions the former decays to a pointer to a char, while the latter is a pointer to an array of 6 chars. It is also worth noting that a[b] is syntactic sugar for *(a + b). Indeed, you could write 2[array] and it works perfectly according to the standard.
So we could take advantage of this to write:
char last_element = (&array)[1][-1];
&array points to an object with a size of 6 chars, so (&array)[1] is a pointer to the chars located right after the array. By applying [-1] I am therefore accessing the last element.
With this I could, for example, reverse the entire array:
void swap(char *a, char *b) { *a ^= *b; *b ^= *a; *a ^= *b; }
int main() {
char u[] = {1,2,3,4,5,6,7,8,9,10};
for (int i = 0; i < sizeof(u) / 2; i++)
swap(&u[i], &(&u)[1][-i - 1]);
}
Does this method for accessing an array by the end have flaws?
C arrays don't have an end marker. It is your responsibility as the programmer to keep track of the allocated size of the array and to make sure you don't try to access elements outside it. If you do access an element outside the allocated size, the result is undefined behaviour.
C does, however, allow a pointer to point at one element beyond the end of the array, as long as that pointer is not dereferenced. The distinction is important. Thus, this is OK:
char array[N]; char *p; char *end;
for (p = array, end = array + N; p < end; ++p)
do_something(p);
A null or zero value marking the end of an array is the equivalent of the null character that terminates a string.
The C standard does not define the behavior of (&array)[1].
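Consider &array + 1. This is defined by the C standard, for two reasons. First, pointer arithmetic is defined for results ranging from the first element of an array to one beyond its last element. Second, for pointer arithmetic, a pointer to a single object behaves like a pointer to the first element of an array of length one. Here &array is a pointer to a single object (that is itself an array, but the pointer arithmetic is for the pointer-to-the-array, not a pointer-to-an-element). So &array + 1 is defined pointer arithmetic that points just beyond the end of array.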
However, by definition of the subscript operator, (&array)[1] is *(&array + 1). While &array + 1 is defined, applying * to it is not. C 2018 6.5.6 8 explicitly tells us, about the result of pointer arithmetic, “If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.”
Because of the way most compilers are designed, the code in the question may move data around as you desire. However, this is not a behavior you should rely on. You can obtain a good pointer to just beyond the last element of the array with char *End = array + sizeof array / sizeof *array;. Then you can use End[-1] to refer to the last element, End[-2] to refer to the penultimate element, and so on.
Although the Standard specifies that arrayLvalue[i] means (*((arrayLvalue)+(i))), which would be processed by taking the address of the first element of arrayLvalue, gcc sometimes treats [], when applied to an array-type value or lvalue, as an operator which behaves like an indexed version of the .member syntax, yielding a value or lvalue which the compiler will treat as being part of the array type. I don't know if this is ever observable when the array-type operand isn't a member of a struct or union, but the effects are clearly demonstrable in cases where it is, and I know of nothing that would guarantee that similar logic wouldn't be applied to nested arrays.
struct foo {unsigned char x[12];};
int test1(struct foo *p1, struct foo *p2)
{
p1->x[0] = 1;
p2->x[1] = 2;
return p1->x[0];
}
int test2(struct foo *p1, struct foo *p2)
{
char *p;
p1->x[0] = 1;
(&p2->x[0])[1] = 2;
return p1->x[0];
}
The code gcc generates for test1 will always return 1, while the generated code for test2 will return whatever is in p1->x[0]. I am unaware of anything in the Standard or the documentation for gcc that would suggest the two functions should behave differently, or that explains how one should force a compiler to generate code that would accommodate the case where p1 and p2 happen to identify overlapping parts of an allocated block, should that be necessary. Although the optimization used in test1() would be reasonable for the function as written, I know of no documented interpretation of the Standard that would treat that case as UB but define the behavior of the code if it wrote to p2->x[0] instead of p2->x[1].