
Can I read any readable valid memory location via a (unsigned) char* in C++?

My search-fu seems lacking today.

I would like to know if it is legal according to std C++ to inspect "any" memory location via an (unsigned(?)) char*. By any location I mean any valid address of an object or array (or inside an array) inside the program.

By way of example:

void passAnyObjectOrArrayOrSomethingElseValid(void* pObj) {
   unsigned char* pMemory = static_cast<unsigned char*>(pObj);
   MyTypeIdentifyier x = tryToFigureOutWhatThisIs(pMemory);
}

Disclaimer: This question is purely academic. I do not intend to put this into production code! By legal I mean legal according to the standard, that is, whether it would work on 100% of all implementations (not just on x86 or some common hardware).

Sub-question: Is static_cast the right tool to get from the void* address to the char* pointer?

Martin Ba asked Jul 09 '11 12:07




2 Answers

C++ assumes strict aliasing, which means the compiler may assume that two pointers to fundamentally different types never refer to the same object.

However, as correctly pointed out by bdonlan, the standard makes an exception for char and unsigned char pointers.

Thus, in general it is undefined behaviour to read an arbitrary object (which might be of any type) through a pointer to an unrelated type, but for the particular case of unsigned char, as in the question, it is allowed (ISO 14882:2003 3.10(15)).
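To make the exception concrete, here is a minimal sketch that copies out the bytes of an object by reading them through an unsigned char pointer. The names bytesAt and objectBytes are hypothetical helpers invented for this illustration, not anything from the question or the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies n bytes starting at p, reading each one
// through an unsigned char lvalue, which the aliasing rule permits.
std::vector<unsigned char> bytesAt(const void* p, std::size_t n) {
    assert(p != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    return std::vector<unsigned char>(bytes, bytes + n);
}

// Hypothetical convenience wrapper: the object representation of obj.
// &obj converts implicitly to const void*, so no cast is needed here.
template <typename T>
std::vector<unsigned char> objectBytes(const T& obj) {
    return bytesAt(&obj, sizeof(T));
}
```

Calling objectBytes on a double, for example, yields sizeof(double) bytes; which values those bytes hold depends on the implementation.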

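static_cast does compile-time type checking. For the conversion in the question, from void* to unsigned char*, that check passes and static_cast is sufficient. It will refuse to convert between unrelated object pointer types, though (say, from int* to unsigned char*); in such a case, you will want reinterpret_cast.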
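For the sub-question about casts, a small sketch of the two cases may help. The function names firstByte and firstByteOfInt are made up for this example. As far as I can tell from the standard's cast rules, static_cast is accepted when the source is void*, while a typed source pointer such as int* needs reinterpret_cast.

```cpp
#include <cassert>

// Hypothetical helper mirroring the question's function signature:
// returns the first byte of whatever pObj points to.
unsigned char firstByte(void* pObj) {
    assert(pObj != nullptr);
    // void* -> unsigned char*: static_cast is enough.
    unsigned char* pMemory = static_cast<unsigned char*>(pObj);
    return pMemory[0];
}

// Hypothetical helper starting from a typed pointer instead.
unsigned char firstByteOfInt(int* pInt) {
    assert(pInt != nullptr);
    // int* -> unsigned char*: static_cast<unsigned char*>(pInt) would not
    // compile, so reinterpret_cast is needed here.
    unsigned char* pMemory = reinterpret_cast<unsigned char*>(pInt);
    return pMemory[0];
}
```

Both functions read the same byte when given the address of the same int; they differ only in which cast the source pointer type requires.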

Damon answered Sep 21 '22 00:09



Per ISO/IEC 9899:1999 (E) §6.5/7:

 7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • [...]
  • a character type

So it is legal (in C) to examine the object behind a (valid) pointer through an unsigned char lvalue. However, the contents you'll find there are unspecified; tryToFigureOutWhatThisIs has no well-defined way of actually figuring out what it's looking at. I don't have a copy of the C++ spec here, but I suspect it uses the same definition, in order to maintain compatibility.
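A short C++ sketch of what "unspecified contents" means in practice: the bytes are readable, but their layout is up to the implementation, so they are only meaningful if you already know the type. All function names below are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstring>

// Rebuilds a double from sizeof(double) raw bytes. This is only meaningful
// if the bytes were copied from a double on the same implementation.
double doubleFromBytes(const unsigned char* raw) {
    assert(raw != nullptr);
    double value;
    std::memcpy(&value, raw, sizeof value);
    return value;
}

// Copies a double into a raw byte buffer and back again.
double roundTripThroughBytes(double value) {
    unsigned char raw[sizeof(double)];
    std::memcpy(raw, &value, sizeof raw);
    return doubleFromBytes(raw);
}

// Reports whether the lowest-addressed byte of an unsigned int holds its
// least significant bits. The answer differs between implementations,
// which is why raw bytes alone do not identify a value.
bool lowByteStoredFirst() {
    unsigned int one = 1u;
    return *reinterpret_cast<unsigned char*>(&one) == 1u;
}
```

The round trip works because the caller knows the bytes came from a double; nothing in the bytes themselves says so, which is the problem tryToFigureOutWhatThisIs would face.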

bdonlan answered Sep 21 '22 00:09
