Why do memcpy/memmove reverse data when copying an int to a byte buffer?

Tags:

c++

c

So, my question is pretty simple:

I need to fill a char/unsigned char array with some information. Some values in the middle are taken from short/int types and this is what happens:

Code:

int foo = 15; //0x0000000F
unsigned char buffer[100]={0};

..
memcpy(&buffer[offset], &foo, sizeof(int)); // same result with memmove
...

Output:

... 0F 00 00 00 ..

For now I have written a function to reverse these fields, but I don't think it is a smart solution, as it costs execution time, resources, and development time.

Is there an easier way to do it?

Edit: As many of you have pointed out, this behaviour is due to the little-endian processor, but my problem remains. I need to fill this buffer with int/short values in big-endian order, because I need to serialize the data for transmission to another machine. Whether that machine is little- or big-endian doesn't matter, as the protocol is already defined this way.

Note: For compiling in C++

Joster asked Mar 15 '17 14:03


2 Answers

It's because the processor architecture you use is little endian. Multibyte numbers (anything bigger than a uint8_t) are stored with the least significant byte at the lowest address.
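To see which byte order a given machine uses, one can copy the in-memory representation of a known value and look at the first byte. The following is only a minimal sketch; `hostIsLittleEndian` is a hypothetical helper name, not something from the question.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): returns true when the
// host stores the least significant byte of an integer at the lowest address.
bool hostIsLittleEndian()
{
    const std::uint32_t probe = 1;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);  // copies the bytes exactly as stored
    return bytes[0] == 1;                      // low byte first means little endian
}
```

On a little-endian machine this should return true, which matches the `0F 00 00 00` dump in the question.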

Edit

What you do about it really depends on what the buffer is for. If you are only going to use the buffer internally, forget about byte swapping: you would have to do it in both directions, and it's a waste of time.

If it is for some external entity, e.g. a file or a network protocol, the specification of the file or network protocol will say what the endianness is. For example, network byte order for all the Internet protocols is big-endian. The networking library provides a family of functions (htonl, htons, ntohl, ntohs) to convert values for use in sending and receiving Internet protocol messages. See, for instance:

https://linux.die.net/man/3/htonl
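As a rough sketch of how those functions could be combined with memcpy for the buffer in the question: convert to network order first, then copy. This assumes a POSIX system where `htonl` and `htons` come from `<arpa/inet.h>`; the helper names `putUInt32Network` and `putUInt16Network` are made up for illustration.

```cpp
#include <arpa/inet.h>  // htonl, htons (POSIX; on Windows these come from <winsock2.h>)
#include <cstdint>
#include <cstring>

// Hypothetical helper: stores a 32-bit value at dest in network (big-endian) order.
void putUInt32Network(unsigned char* dest, std::uint32_t value)
{
    const std::uint32_t wire = htonl(value);  // host order -> network order
    std::memcpy(dest, &wire, sizeof wire);    // memcpy copies the bytes unchanged
}

// Hypothetical helper: same idea for 16-bit values, e.g. a short field.
void putUInt16Network(unsigned char* dest, std::uint16_t value)
{
    const std::uint16_t wire = htons(value);
    std::memcpy(dest, &wire, sizeof wire);
}
```

With these, `putUInt32Network(&buffer[offset], 15)` should leave `00 00 00 0F` in the buffer regardless of the host's own byte order.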

If you want to roll your own, the portable way is to use bit shifts, e.g.:

#include <stdint.h>

// Writes number into buffer[0..3], most significant byte first.
void writeUInt32ToBufferBigEndian(uint32_t number, uint8_t* buffer)
{
    buffer[0] = (uint8_t) ((number >> 24) & 0xff);
    buffer[1] = (uint8_t) ((number >> 16) & 0xff);
    buffer[2] = (uint8_t) ((number >> 8) & 0xff);
    buffer[3] = (uint8_t) ((number >> 0) & 0xff);
}
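
The receiving side needs the inverse operation. A possible counterpart to writeUInt32ToBufferBigEndian, again using only shifts so that it does not depend on the host's byte order, might look like this (the name `readUInt32FromBufferBigEndian` is an assumption chosen to mirror the writer):

```cpp
#include <cstdint>

// Hypothetical counterpart to the writer: rebuilds a 32-bit value from four
// bytes stored most significant byte first.
std::uint32_t readUInt32FromBufferBigEndian(const std::uint8_t* buffer)
{
    return (static_cast<std::uint32_t>(buffer[0]) << 24) |
           (static_cast<std::uint32_t>(buffer[1]) << 16) |
           (static_cast<std::uint32_t>(buffer[2]) <<  8) |
            static_cast<std::uint32_t>(buffer[3]);
}
```

The casts to a 32-bit unsigned type before shifting avoid shifting a promoted `int` into its sign bit.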
JeremyP answered Oct 31 '22 18:10


Neither memcpy nor memmove reverses data when copying objects. The byte values you observe when dumping the character array correspond to the way the 32-bit value 15 (0F in hexadecimal) is stored in memory in your environment.

It appears to be in little-endian order, 0F 00 00 00, which is very common in desktop and laptop computers, and in practice on most smartphones as well. Some other architectures (for example, classic SPARC or Motorola 68k systems) store integer values in big-endian order, 00 00 00 0F, which you consider more natural, but both methods are equally correct. It is just a matter of convention. Little-endian order means the least significant byte is stored first, while big-endian is the opposite: the most significant byte is stored first.
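
To make the two conventions concrete, here is a small sketch that produces both layouts explicitly with shifts, so the result is the same on any host. The function names are invented for this illustration.

```cpp
#include <cstdint>

// Illustrative only: least significant byte first.
void storeLittleEndian32(unsigned char* out, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
}

// Illustrative only: most significant byte first.
void storeBigEndian32(unsigned char* out, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * (3 - i))) & 0xFF);
}
```

For the value 15, the first function should yield 0F 00 00 00 and the second 00 00 00 0F.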

A comprehensive article on Wikipedia covers this subject in depth.

In your application, you must specify in which order the binary value is expected to be stored, and if you decide on big-endian, I suggest you use this code for portability across environments:

#include <stdint.h>

int foo = 15; //0x0000000F
unsigned char buffer[100] = { 0 };

...
buffer[offset + 0] = ((uint32_t)foo >> 24) & 0xFF;
buffer[offset + 1] = ((uint32_t)foo >> 16) & 0xFF;
buffer[offset + 2] = ((uint32_t)foo >>  8) & 0xFF;
buffer[offset + 3] = ((uint32_t)foo >>  0) & 0xFF;
...
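
Since the question mixes int and short fields, the same shifts could be wrapped in small helpers that also advance the offset. This is only a sketch under the assumption that the fields are 16 and 32 bits wide; `appendUInt16BigEndian` and `appendUInt32BigEndian` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers: each writes one field in big-endian order starting at
// buffer[offset] and returns the offset of the next free byte.
// The caller must make sure the buffer is large enough.
std::size_t appendUInt16BigEndian(unsigned char* buffer, std::size_t offset,
                                  std::uint16_t value)
{
    buffer[offset + 0] = static_cast<unsigned char>((value >> 8) & 0xFF);
    buffer[offset + 1] = static_cast<unsigned char>(value & 0xFF);
    return offset + 2;
}

std::size_t appendUInt32BigEndian(unsigned char* buffer, std::size_t offset,
                                  std::uint32_t value)
{
    buffer[offset + 0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    buffer[offset + 1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    buffer[offset + 2] = static_cast<unsigned char>((value >>  8) & 0xFF);
    buffer[offset + 3] = static_cast<unsigned char>(value & 0xFF);
    return offset + 4;
}
```

A caller would chain them, for example `offset = appendUInt32BigEndian(buffer, offset, 15);` followed by `offset = appendUInt16BigEndian(buffer, offset, 0x1234);`, so each field lands directly after the previous one.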
chqrlie answered Oct 31 '22 16:10