
memcpy adds ff ff ff to the beginning of a byte

Tags: c++, memcpy

I have an array like this:

unsigned char array[] = {'\xc0', '\x3f', '\x0e', '\x54', '\xe5', '\x20'};
unsigned char array2[6];

When I use memcpy:

memcpy(array2, array, 6);

And print both of them:

printf("%x %x %x %x %x %x", array[0],  // ... etc
printf("%x %x %x %x %x %x", array2[0], // ... etc

One prints:

c0 3f e 54 e5 20

but the other prints:

ffffffc0 3f e 54 ffffffe5 20

What happened?

Hock asked Aug 18 '10 13:08




2 Answers

I've turned your code into a complete, compilable example. I also added a third array of plain char, which is signed in my environment.

#include <cstring>
#include <cstdio>

using std::memcpy;
using std::printf;

int main()
{

        unsigned char array[] = {'\xc0', '\x3f', '\x0e', '\x54', '\xe5', '\x20'};
        unsigned char array2[6];
        char array3[6];

        memcpy(array2, array, 6);
        memcpy(array3, array, 6);

        printf("%x %x %x %x %x %x\n", array[0], array[1], array[2], array[3], array[4], array[5]);
        printf("%x %x %x %x %x %x\n", array2[0], array2[1], array2[2], array2[3], array2[4], array2[5]);
        printf("%x %x %x %x %x %x\n", array3[0], array3[1], array3[2], array3[3], array3[4], array3[5]);

        return 0;
}

My results were what I expected.

c0 3f e 54 e5 20
c0 3f e 54 e5 20
ffffffc0 3f e 54 ffffffe5 20

As you can see, the 'extra' ff digits are prepended only when the array is of a signed char type. The reason is that once memcpy has populated the array of signed char, the bytes with the high bit set correspond to negative char values. When passed to printf, the char values are promoted to int, which preserves the negative value and so effectively sign-extends the bit pattern.

%x prints them in hexadecimal as though they were unsigned int, but because the argument was actually passed as an int, the behaviour is technically undefined. Typically, on a two's complement machine, the result is the same as the standard signed-to-unsigned conversion, which uses mod 2^N arithmetic (where N is the number of value bits in an unsigned int). Because the value was only 'slightly' negative (coming from a narrow signed type), after conversion it is close to the maximum possible unsigned int value, i.e. it has many leading 1s in binary, or leading f digits in hex.

CB Bailey answered Oct 19 '22 19:10


The problem is not memcpy (unless your char type really is 32 bits rather than 8); it looks more like integer sign extension while printing.

You may want to change your printf to explicitly use the unsigned char conversion, i.e.

printf("%hhx %hhx...", array2[0], array2[1],...);

As a guess, it's possible that your compiler/optimizer is handling array (whose size and contents are known at compile time) and array2 differently, pushing constant values onto the stack in the first case and erroneously pushing sign-extended values in the second.

Hasturkun answered Oct 19 '22 18:10