One often needs to read from memory one byte at a time, like in this naive memcpy()
implementation:
#include <stddef.h>  /* for size_t */

void *memcpy(void *dest, const void *src, size_t n)
{
    const char *from = (const char *)src;  /* keep src's const qualifier */
    char *to = (char *)dest;
    while (n--) *to++ = *from++;
    return dest;
}
However, I sometimes see people explicitly use unsigned char * instead of just char *. Of course, char and unsigned char are not necessarily the same type. But does it make a difference whether I use char *, signed char *, or unsigned char * when bytewise reading/writing memory?
UPDATE: Actually, I'm fully aware that c = 200 may yield different values depending on the type of c. What I am asking here is why people sometimes use unsigned char * instead of just char * when reading memory, e.g. in order to store a uint32_t in a char[4].
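For concreteness, here is a minimal sketch of that scenario (the names store_u32le and load_u32le are made up for this illustration): a uint32_t stored into and read back from a 4-byte buffer, one byte at a time, using unsigned char so the shifts and masks behave predictably.

#include <stdint.h>

/* Store v into buf, least significant byte first. */
void store_u32le(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)(v & 0xFF);
    buf[1] = (unsigned char)((v >> 8) & 0xFF);
    buf[2] = (unsigned char)((v >> 16) & 0xFF);
    buf[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Reassemble the value from the four bytes. */
uint32_t load_u32le(const unsigned char buf[4])
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}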
A signed char is a signed integer value and an unsigned char an unsigned one; both are typically smaller than, and are guaranteed not to be bigger than, a short. On typical implementations with 8-bit bytes, signed char values range from -128 to 127 and unsigned char values from 0 to 255. With Microsoft's compiler, the /J option changes the default type for char from signed char to unsigned char.
On x86 systems char is generally signed. On ARM systems it is generally unsigned (Apple iOS is an exception).
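A short example (my own, assuming an ordinary two's complement platform) that makes this difference observable: the byte 0xFF read through a plain char * sign-extends to -1 where char is signed, so the first comparison below fails there, while the unsigned char * view always sees 255.

#include <stdio.h>

int main(void)
{
    unsigned char byte = 0xFF;
    char          *sp = (char *)&byte;
    unsigned char *up = &byte;

    if (*sp == 0xFF)  /* false where char is signed: *sp is -1 */
        printf("plain char sees 0xFF\n");
    if (*up == 0xFF)  /* always true: *up is 255 */
        printf("unsigned char sees 0xFF\n");
    return 0;
}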
You should use unsigned char. The C99 standard says that unsigned char is the only type guaranteed to be dense (no padding bits), and it also defines that you may copy any object (except bitfields) exactly by copying it into an unsigned char array, which is the object representation in bytes.

The sensible interpretation of this, to me, is that if you use a pointer to access an object as bytes, you should use unsigned char.

Reference: http://blackshell.com/~msmud/cstd.html#6.2.6.1 (§6.2.6.1 of a C1x draft)
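As a sketch of what that guarantee buys you (dump_bytes is an illustrative helper of mine, not a standard function): any object can be copied into an unsigned char array and its exact object representation printed, byte by byte.

#include <stdio.h>
#include <string.h>

static void dump_bytes(const void *obj, size_t n)
{
    unsigned char bytes[64];   /* demo buffer; cap n so the copy fits */
    if (n > sizeof bytes)
        n = sizeof bytes;
    memcpy(bytes, obj, n);     /* copies the exact object representation */
    for (size_t i = 0; i < n; i++)
        printf("%02x ", bytes[i]);
    putchar('\n');
}

int main(void)
{
    double d = 1.0;
    dump_bytes(&d, sizeof d);  /* prints the bytes representing 1.0 */
    return 0;
}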
This is one point where C++ differs from C. Generally speaking, C only guarantees that raw memory access works for unsigned char; char may be signed, and on a 1's complement or signed magnitude machine, a -0 might be converted to +0 automatically, changing the bit pattern. For some reason (unknown to me), the C++ committee extends the guarantees supporting transparent copy (no change in bit patterns) to char as well as unsigned char; on a 1's complement or signed magnitude machine, the implementors therefore have no choice but to make plain char unsigned, in order to avoid such side effects. (And of course, most programmers today aren't concerned by such machines anyway.)
Anyway, the end result is that older programmers, who come from a C background (and may have actually worked on a 1's complement or signed magnitude machine), will automatically use unsigned char. It's also a frequent convention to reserve plain char for character data exclusively, with signed char for very small integral values and unsigned char for raw memory or when bit manipulation is intended. Such a rule allows the reader to distinguish between the different uses (provided it is followed religiously).
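A small example of that convention in practice (hexdump is a hypothetical helper, not a library function): plain char carries the text, while the raw-memory view goes through unsigned char *, so every byte reads as 0..255 regardless of whether the platform's char is signed.

#include <stdio.h>
#include <string.h>

static void hexdump(const void *mem, size_t n)
{
    const unsigned char *p = mem;  /* raw memory: unsigned char */
    for (size_t i = 0; i < n; i++)
        printf("%02x%c", p[i], (i + 1) % 16 ? ' ' : '\n');
    if (n % 16)
        putchar('\n');
}

int main(void)
{
    const char *text = "character data";   /* text: plain char */
    hexdump(text, strlen(text) + 1);       /* bytes, including the NUL */
    return 0;
}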