
Casting a pointer does not produce an lvalue. Why?

After posting one of my most controversial answers here, I dare to ask a few questions and eventually fill some gaps in my knowledge.

Why isn't an expression of the kind ((type_t *) x) considered a valid lvalue, assuming that x itself is a pointer and an lvalue, not just some expression?

I know many will say "the standard disallows it", but from a logical standpoint it seems reasonable. What is the reason that the standard disallows it? After all, any two pointers are of the same size and the pointer type is just a compile-time abstraction that indicates the appropriate offset that should be applied when doing pointer arithmetic.

Blagovest Buyukliev asked Sep 16 '11 14:09



2 Answers

An even better example: unary + yields an rvalue, as does x+0.

The underlying reason is that all these things, including your cast, create a new value. Casting a value to the type it already has likewise creates a new value, regardless of whether pointers to different types have the same representation. In some cases the new value happens to be equal to the old value, but in principle it's a new value; it's not intended to be used as a reference to the old object, and that's why it's an rvalue.

For these to be lvalues, the standard would have to add some special cases that certain operations when used on an lvalue result in a reference to the old object, instead of a new value. AFAIK there's no great demand for those special cases.
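To illustrate the "new value" point, here is a minimal sketch. The function name is made up for this example. The cast result is copied into a separate pointer, and advancing that copy leaves the original pointer where it was:

```c
#include <assert.h>

/* Hypothetical demo function, not from the answer.
   Returns 1 if advancing the cast result left the original pointer alone. */
int cast_copy_leaves_original(void)
{
    char buf[8] = {0};
    char *p = buf;

    /* The cast produces a new pointer value; q is the object that holds it. */
    unsigned char *q = (unsigned char *)p;

    q++;  /* modifies q only, p still points at buf[0] */

    return p == buf && q == (unsigned char *)buf + 1;
}
```

Calling cast_copy_leaves_original() should return 1: the cast gave us a value to store somewhere else, not another name for p.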

Steve Jessop answered Oct 13 '22 18:10



Actually you are right and wrong at the same time.

In C it is possible to reinterpret an lvalue as an lvalue of another type. However, the syntax is a bit different from your straightforward approach:

An lvalue pointer can be cast to an lvalue pointer of a different type like this in C:

char *ptr;

ptr = malloc(20);
assert(ptr);
*(*(int **)&ptr)++ = 5;

Because malloc() returns memory that is suitably aligned for any object type, the int store above is properly aligned. The following, however, is not portable and may trap due to misalignment on certain machines:

char *ptr;

ptr = malloc(20);
assert(ptr);
*ptr++ = 0;
*(*(int **)&ptr)++ = 5;  /* may trap due to misalignment */
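If you want to see the misalignment before it bites, a check along these lines may help. This is only a sketch: the helper names are made up, it needs C11 for _Alignof and _Alignas, and it assumes that converting a pointer to uintptr_t yields the plain address, which is implementation-defined but holds on common flat-memory platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the answer: nonzero if p is aligned for int.
   Assumes uintptr_t exposes the address as a plain integer. */
static int is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}

/* Returns 1 when an int-aligned buffer passes the check and the address
   one byte further fails it (assumes _Alignof(int) > 1). */
int alignment_demo(void)
{
    _Alignas(int) char buf[2 * sizeof(int)] = {0};

    return is_aligned_for_int(buf) && !is_aligned_for_int(buf + 1);
}
```

The second address corresponds to ptr after the *ptr++ = 0 step above: one byte past an aligned start, so an int store there is misaligned.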

To sum it up:

  • If you cast a pointer, this leads to an rvalue.
  • Using * on a pointer leads to an lvalue (*ptr can be assigned to).
  • ++ (like in *(arg)++) needs an lvalue to operate on (arg must be an lvalue)

Hence ((int *)ptr)++ fails: ptr is an lvalue, but (int *)ptr is not. The ++ can roughly be rewritten as ((int *)ptr += 1, (int *)ptr - 1), and it's the (int *)ptr += 1 which fails, because the cast results in a pure rvalue.
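The three points above can be sketched without any lvalue tricks: copy the cast result into a named pointer, apply ++ to that, and write the new address back. The helper names below are invented for this illustration, and the caller is assumed to pass a pointer that is suitably aligned for int:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: store value as an int at *pp, then advance *pp by
   sizeof(int). The caller must make sure *pp is aligned for int. */
static void store_int_and_advance(char **pp, int value)
{
    int *ip = (int *)*pp;  /* copy the cast result into a named object */
    *ip++ = value;         /* ip is an lvalue, so ++ is allowed */
    *pp = (char *)ip;      /* write the advanced address back */
}

/* Returns 1 if the int was stored and the pointer advanced as expected. */
int store_demo(void)
{
    char *base = malloc(20);
    char *ptr = base;
    int ok;

    if (base == NULL)
        return 0;

    store_int_and_advance(&ptr, 5);
    ok = ptr == base + sizeof(int) && *(int *)base == 5;

    free(base);
    return ok;
}
```

This is more verbose than a one-liner, but every ++ and every assignment operates on something that really is an lvalue.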


Please note that this is not a language shortcoming. Casting must not produce lvalues. Look at the following:

(double *)1   = 0;
(double)ptr   = 0;
(double)1     = 0;
(double *)ptr = 0;

The first 3 do not compile. Why would anybody expect the 4th line to compile? Programming languages should never expose such surprising behavior. Even worse, it may make the behavior of programs unclear. Consider:

#ifndef DATA
#define DATA double
#endif
#define DATA_CAST(X) ((DATA)(X))

DATA_CAST(ptr) = 3;

This cannot compile, right? However, if your expectation held, this would suddenly compile with cc -DDATA='double *'! From a stability point of view it is important not to introduce such contextual lvalues for certain casts.

The right thing for C is that an expression either is an lvalue or is not, and this shall not depend on some arbitrary context which might be surprising.


As noted by Jens, there already is one operator to create lvalues. It's the pointer dereferencing operator, the "unary *" (as in *ptr).

Note that *ptr can be written as 0[ptr] and *ptr++ can be written as 0[ptr++], because E1[E2] is defined as *((E1)+(E2)). Array subscripts are lvalues, so *ptr is an lvalue, too.

Wait, what? 0[ptr] must be an error, right?

Actually, no. Try it! This is valid C syntax. The following C program compiles and runs successfully on 32- and 64-bit Intel:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

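int
main(void)
{
  char *ptr;

  ptr = malloc(20);
  assert(ptr);

  /* store 5 as an int at ptr, then advance ptr by sizeof(int) */
  0[(*(int **)&ptr)++] = 5;
  /* assumes a 4-byte little-endian int: the bytes are 05 00 00 00 */
  assert(ptr[-1]==0 && ptr[-2]==0 && ptr[-3]==0 && ptr[-4]==5);

  return 0;
}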
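The subscript equivalence used in that program can also be checked on its own, without any casts. A small sketch (the function name is made up for this example):

```c
#include <assert.h>

/* Hypothetical demo function: returns 1 if the reversed subscript form
   designates the same objects as the usual forms. */
int subscript_forms_agree(void)
{
    int a[3] = {10, 20, 30};
    int *p = a;

    0[p] = 11;  /* same as *p = 11 */
    1[p] = 21;  /* same as p[1] = 21 */

    /* &0[p] and &*p are the same address, so both name the same lvalue */
    return *p == 11 && p[1] == 21 && &0[p] == &*p;
}
```

Since both assignments compile, 0[p] and 1[p] are lvalues just like *p and p[1].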

In C we can have both: casts, which never create lvalues, and the ability to use casts in a way that keeps the lvalue property alive.

But to get an lvalue out of casting, two more steps are needed:

  • Before the cast, get the address of the original lvalue. As it is an lvalue, you always can get this address.
  • Cast to a pointer to the desired type (usually the desired type is a pointer as well, so you have a pointer to that pointer).
  • After the cast, dereference this additional pointer, which gives you an lvalue again.

Hence instead of the wrong *((int *)ptr)++ we can write *(*(int **)&ptr)++. This also makes sure that ptr in this expression must already be an lvalue. Or, to write this with the help of the C preprocessor:

#define LVALUE_CAST(TYPE,PTR) (*((TYPE *)&(PTR)))

So for any passed-in void *ptr (which might be disguised as char *ptr), we can write:

*LVALUE_CAST(int *,ptr)++ = 5;

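Except for the usual pointer arithmetic caveats (abnormal program termination or undefined behavior on incompatible types, which mostly stems from alignment issues), this works in C. Note that it also accesses ptr through an lvalue of a different pointer type, so it assumes that both pointer types share the same representation and that the compiler does not optimize based on type-based aliasing rules.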
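To see the address, cast, dereference steps in isolation, here is a small sketch that applies the macro to a type pair for which the aliasing rules explicitly allow the access (an unsigned int read and written as int). That choice of types and the function name are my own assumptions for illustration, and the expected values assume two's complement:

```c
#include <assert.h>
#include <limits.h>

#define LVALUE_CAST(TYPE,PTR) (*((TYPE *)&(PTR)))

/* Hypothetical demo function: returns 1 if assignment and ++ through the
   macro both modified the original object u. */
int lvalue_cast_demo(void)
{
    unsigned int u = 0;

    /* expands to (*((int *)&(u))) = -1: an int lvalue that designates u */
    LVALUE_CAST(int, u) = -1;
    if (u != UINT_MAX)  /* all bits set, assuming two's complement */
        return 0;

    /* ++ is accepted because the expansion is a dereference, not a cast */
    LVALUE_CAST(int, u)++;

    return u == 0;
}
```

Unlike a plain (int)u, the macro result can stand on the left of = and be the operand of ++, and the change lands in u itself.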

Tino answered Oct 13 '22 19:10
