I've been working with hardware APIs for a long time, and almost all of the APIs I've worked with had a C interface. So I've often been working with naked news, insecure buffering and many C functions wrapped in C++ code. In the end, the frontier between pure C code and pure C++ code got blurred in my mind (and I don't know whether clarifying this frontier is useful at all).
Now, due to some new coding style requirements, I need to refactor all the code suspected of being insecure into more secure code written in C++ (assuming that the C++ code would be more secure). The final goal is to increase code security using the tools that C++ brings.
So, in order to get rid of all my confusion, I'm asking for help with a couple of C/C++ topics.
memcpy vs std::copy
AFAIK memcpy is a function that lives in the C library, so it isn't C++ish; on the other hand, std::copy is a function in the STL, so it's pure C++.
- Is it true that std::copy will call std::memcpy (from the cstring header) if the data is trivially copyable?
- Would refactoring all the memcpy calls into std::copy calls make the code more "pure C++"?
To deal with the new code style requirements I've decided to go on with the memcpy refactor after all, but I still have some questions about memcpy and std::copy.
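For reference, the "trivially copyable" condition mentioned above can be checked at compile time. The following is only a minimal sketch, assuming a hypothetical Sample struct that is not part of the real code; it performs the same copy once with std::memcpy and once with std::copy:

```cpp
#include <algorithm>
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical record type, used only for this illustration.
struct Sample {
    int id;
    float value;
};

// memcpy is only well-defined for trivially copyable types, so check it at compile time.
static_assert(std::is_trivially_copyable<Sample>::value,
              "Sample must be trivially copyable to be copied with memcpy");

// Copy with memcpy: the length is given in bytes and nothing checks the pointer types.
inline std::array<Sample, 3> copyWithMemcpy(const std::array<Sample, 3> &source)
{
    std::array<Sample, 3> target{};
    std::memcpy(target.data(), source.data(), sizeof(Sample) * source.size());
    return target;
}

// Copy with std::copy: the length comes from the iterators and the element
// types must be assignable to each other, otherwise the call does not compile.
inline std::array<Sample, 3> copyWithStdCopy(const std::array<Sample, 3> &source)
{
    std::array<Sample, 3> target{};
    std::copy(source.begin(), source.end(), target.begin());
    return target;
}
```

For a trivially copyable type like this one, both functions produce the same values; whether std::copy is lowered to a memcpy/memmove call internally is an implementation detail of the standard library in use.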
memcpy is type-unsafe because it works with raw void pointers that accept any kind of pointer regardless of its type, but at the same time it is very flexible; std::copy lacks this flexibility in exchange for type safety. At first sight, memcpy is the best choice for serialization and deserialization routines (that's my real use case indeed), for example, to send some values through a custom serial port library:
    #include <cstring>
    #include <string>

    // sendBuffer() is provided by the custom serial port library.
    void send(const std::string &value)
    {
        const std::string::size_type Size(value.size());
        const std::string::size_type TotalSize(sizeof(Size) + value.size());
        unsigned char *Buffer = new unsigned char[TotalSize];
        unsigned char *Current = Buffer;
        // Write the size prefix first, then the characters right after it.
        memcpy(Current, &Size, sizeof(Size));
        Current += sizeof(Size);
        memcpy(Current, value.c_str(), Size);
        sendBuffer(Buffer, TotalSize);
        delete []Buffer;
    }
The code above works fine, but looks horrible: we're bypassing the std::string encapsulation by accessing its internal memory through the std::string::c_str() method, we need to take care of allocating and deallocating dynamic memory, play with pointers and treat all values as unsigned chars (see the next part). The question is: is there a better way to do this?
My first attempts at solving the above problems using std::copy don't entirely satisfy me:
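    #include <algorithm>
    #include <string>
    #include <vector>

    void send(const std::string &value)
    {
        const std::string::size_type Size(value.size());
        const std::string::size_type TotalSize(sizeof(Size) + value.size());
        std::vector<unsigned char> Buffer(TotalSize, 0);
        // Copy the size byte by byte; copying from &Size directly would convert
        // the whole value into a single unsigned char.
        const unsigned char *SizeBytes = reinterpret_cast<const unsigned char *>(&Size);
        std::copy(SizeBytes, SizeBytes + sizeof(Size), Buffer.begin());
        std::copy(value.begin(), value.end(), Buffer.begin() + sizeof(Size));
        sendBuffer(Buffer.data(), TotalSize);
    }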
With the above approach, memory management isn't a problem anymore: std::vector takes responsibility for allocating, storing and finally deallocating the data at the end of the scope. But the calls mixing std::copy with pointer arithmetic and iterator arithmetic are pretty annoying and, in the end, I'm ignoring the std::vector encapsulation in the sendBuffer call after all.
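One possible way to hide the pointer and iterator arithmetic is to append to the vector instead of writing at computed offsets. This is only a sketch under assumptions: ByteBuffer, appendValue, appendString and serialize are hypothetical names introduced here, and the layout (native size prefix followed by the characters) is the same one the send() functions above produce.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical alias for the byte storage used in this sketch.
using ByteBuffer = std::vector<unsigned char>;

// Appends the object representation of a trivially copyable value to the buffer.
template <typename T>
void appendValue(ByteBuffer &buffer, const T &value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable values can be appended as raw bytes");
    const unsigned char *first = reinterpret_cast<const unsigned char *>(&value);
    buffer.insert(buffer.end(), first, first + sizeof(T));
}

// Appends the characters of a string, without a terminating null character.
inline void appendString(ByteBuffer &buffer, const std::string &text)
{
    buffer.insert(buffer.end(), text.begin(), text.end());
}

// Builds the size prefix followed by the characters of the string.
inline ByteBuffer serialize(const std::string &value)
{
    ByteBuffer buffer;
    buffer.reserve(sizeof(std::string::size_type) + value.size());
    appendValue(buffer, value.size());
    appendString(buffer, value);
    return buffer;
}
```

With this in place, send() could shrink to building the buffer with serialize(value) and passing buffer.data() and buffer.size() to sendBuffer, so the only raw pointer left is the one the serial port library requires.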
After the previous tries, I coded something with std::stringstreams, but the results were even worse, and now I'm wondering:
- Is there a better way to do this? (I know about boost::serialization, but for now I'm not allowed to integrate it.)
And:
- What is the use of std::copy for serialization/deserialization purposes (if any)?
- Is the rationale of std::copy limited to copying containers or arrays, making it a bad choice for raw memory?
malloc/free vs new/delete vs std::allocator
The other big topic is memory allocation. AFAIK the malloc/free functions aren't forbidden in C++ although they come from C, and the new/delete operators belong to C++ and aren't part of ANSI C.
- Can new/delete be used in ANSI C?
Assuming that I need to refactor all C-flavoured code into C++ code, I'm getting rid of all the malloc/free calls spread around some legacy code, and I've found that reserving dynamic memory is quite confusing: the void type doesn't carry any information about size, so it's impossible to reserve a data buffer using void as the type:
    void *Buffer = new void[100]; // <-- How many bytes is each 'void'?
Due to the lack of pure raw-binary-data pointers, it is common practice to create pointers to unsigned char: char so that the element count equals the size in bytes, and unsigned to avoid unexpected signed-unsigned conversions during the data copy. Maybe it's a common practice, but it's a mess... unsigned char isn't int, nor float, nor my_awesome_serialization_struct. If I'm forced to choose some kind of dummy pointer to binary data, I'd prefer void * over unsigned char *.
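One way to keep unsigned char out of sight is to confine it to a small wrapper that owns the bytes and only exposes typed access. The sketch below is an assumption-laden illustration: RawBuffer and its members are hypothetical names, and it only handles trivially copyable types.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical wrapper: owns raw bytes, but callers only see typed reads and writes.
class RawBuffer {
public:
    // Reserves byteCount zero-initialized bytes.
    explicit RawBuffer(std::size_t byteCount) : bytes_(byteCount, 0) {}

    // Stores the object representation of value at the given byte offset.
    template <typename T>
    void write(std::size_t offset, const T &value)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be stored as raw bytes");
        checkRange(offset, sizeof(T));
        std::memcpy(bytes_.data() + offset, &value, sizeof(T));
    }

    // Rebuilds a T from the bytes stored at the given byte offset.
    template <typename T>
    T read(std::size_t offset) const
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be read from raw bytes");
        checkRange(offset, sizeof(T));
        T value;
        std::memcpy(&value, bytes_.data() + offset, sizeof(T));
        return value;
    }

    // Untyped view for C interfaces that expect a pointer and a length.
    const void *data() const { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }

private:
    // Throws if [offset, offset + length) does not fit inside the buffer.
    void checkRange(std::size_t offset, std::size_t length) const
    {
        if (offset > bytes_.size() || length > bytes_.size() - offset)
            throw std::out_of_range("RawBuffer access out of range");
    }

    std::vector<unsigned char> bytes_;
};
```

Here RawBuffer(100) reserves exactly 100 bytes, which is what the new void[100] line was trying to express, and the void * only appears at the boundary with the C API. Since C++17 the standard library also offers std::byte as a dedicated type for raw bytes, if the toolchain allows it.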
So when I need a dynamic buffer for serialization/deserialization purposes, there's no way to avoid the unsigned char * stuff when refactoring towards type-safe buffer management; but while I was rage-refactoring all the malloc/free pairs into new/delete pairs, I read about std::allocator.
std::allocator allows reserving memory chunks in a type-safe way. At first sight I bet it would be useful, but there are no great differences between allocating with std::allocator<int>::allocate and new int, or so I thought; the same goes for std::allocator<int>::deallocate and delete.
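One difference that does exist is that std::allocator separates reserving storage from constructing objects, while new does both in one step. A minimal sketch of that life cycle, assuming a hypothetical Tracked type that counts its live instances (Tracked, LifeCycle and runAllocatorLifeCycle are names made up for this illustration):

```cpp
#include <cstddef>
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Number of live Tracked objects observed after each step.
struct LifeCycle {
    int afterAllocate;
    int afterConstruct;
    int afterDestroy;
};

inline LifeCycle runAllocatorLifeCycle(std::size_t count)
{
    using Traits = std::allocator_traits<std::allocator<Tracked>>;
    std::allocator<Tracked> allocator;
    LifeCycle report{};

    // allocate() only reserves uninitialized storage; no constructor runs.
    Tracked *storage = Traits::allocate(allocator, count);
    report.afterAllocate = Tracked::alive;

    // Objects are constructed explicitly, one by one, in that storage.
    for (std::size_t i = 0; i < count; ++i)
        Traits::construct(allocator, storage + i);
    report.afterConstruct = Tracked::alive;

    // Destruction is explicit as well and does not release the storage.
    for (std::size_t i = 0; i < count; ++i)
        Traits::destroy(allocator, storage + i);
    report.afterDestroy = Tracked::alive;

    // The storage itself is released last, with the same count used to allocate.
    Traits::deallocate(allocator, storage, count);
    return report;
}
```

This separation is mainly what containers such as std::vector rely on internally; for a plain byte buffer used in serialization it brings little over a std::vector of bytes, which already uses std::allocator by default.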
And now I've lost my bearings on dynamic memory management, which is why I'm asking:
- Is it good practice to use const char * for serialization/deserialization memory buffers?
- What is std::allocator for, and what is its use in the serialization/deserialization scope (if any)?
Thanks for your attention!
In my experience, type safety in C++ means more than the compiler complaining about type mismatches. It rather means that, in general, you should not have to care about the memory layout of your data. In fact, the C++ standard places only very few requirements on the memory layout of certain data types.
Your serialization is based on direct memory access, so I'm afraid there won't be a simple "pure" C++ solution, and in particular no general compiler/platform-independent solution.
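To illustrate the point about memory layout, one common way to avoid depending on it is to define the wire format explicitly instead of copying object representations. The following is only a sketch under assumptions: the format (a 32-bit little-endian length followed by the characters) and the function names are choices made for this example, not something taken from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Appends a 32-bit value as four bytes, least significant byte first,
// regardless of the byte order of the host.
inline void appendUint32LE(std::vector<unsigned char> &buffer, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
        buffer.push_back(static_cast<unsigned char>((value >> shift) & 0xFFu));
}

// Reads back a value written by appendUint32LE, starting at offset.
inline std::uint32_t readUint32LE(const std::vector<unsigned char> &buffer, std::size_t offset)
{
    if (offset > buffer.size() || buffer.size() - offset < 4)
        throw std::out_of_range("not enough bytes for a 32-bit value");
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(buffer[offset + i]) << (8 * i);
    return value;
}

// Encodes a string as a 32-bit length prefix followed by its characters.
inline std::vector<unsigned char> encodeString(const std::string &text)
{
    if (text.size() > 0xFFFFFFFFu)
        throw std::length_error("string too long for a 32-bit length prefix");
    std::vector<unsigned char> buffer;
    appendUint32LE(buffer, static_cast<std::uint32_t>(text.size()));
    buffer.insert(buffer.end(), text.begin(), text.end());
    return buffer;
}

// Decodes a buffer produced by encodeString.
inline std::string decodeString(const std::vector<unsigned char> &buffer)
{
    const std::uint32_t size = readUint32LE(buffer, 0);
    if (buffer.size() - 4 < size)
        throw std::out_of_range("truncated string payload");
    return std::string(buffer.begin() + 4, buffer.begin() + 4 + size);
}
```

Because the bytes are produced with shifts rather than by copying the in-memory representation of size_type, the output does not depend on the width or byte order of the type on the sending platform. The price is that every serialized type needs its own explicit encoding.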