Standard C++ code for serialization/deserialization purposes

Tags: c++, serialization, buffer, allocator

I've been working with hardware APIs for a long time, and almost all of the APIs I've worked with had a C interface. So I was often working with naked news, unsafe buffering and many C functions wrapped in C++ code. In the end, the frontier between pure C code and pure C++ code got blurred in my mind (and I don't know whether clarifying this frontier is useful at all).

Now, due to some new coding style requirements, I need to refactor all the code suspected of being insecure into more secure code written in C++ (assuming that the C++ code would be more secure). The final goal is to increase code security using the tools that C++ provides.

So, in order to get rid of my confusion, I'm asking for help on a couple of C/C++ topics.

`memcpy` vs `std::copy`

AFAIK memcpy is a function from the C library, so it isn't C++ish; on the other hand, std::copy is a function from the STL, so it's pure C++.

But is this true? After all, implementations of std::copy typically end up doing a std::memmove/std::memcpy-style bulk copy (the functions from the cstring header) if the data is trivially copyable.
Would refactoring all the memcpy calls into std::copy calls make the code more "pure C++"?
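As a minimal sketch of the two calls side by side, assume a hypothetical trivially copyable `Sample` struct (it is not part of the original code). For such a type both functions leave the same bytes in the destination; whether the library really dispatches std::copy to a memmove-style copy is an implementation detail, not something the standard promises.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical record used only for illustration.
struct Sample {
    std::uint16_t id;
    std::uint16_t value;
};

static_assert(std::is_trivially_copyable<Sample>::value,
              "a bytewise copy is only valid for trivially copyable types");

// memcpy: the caller computes the byte count by hand.
inline void copyWithMemcpy(const Sample *from, Sample *to, std::size_t count)
{
    std::memcpy(to, from, count * sizeof(Sample));
}

// std::copy: the element type determines how much is copied.
inline void copyWithStdCopy(const Sample *from, Sample *to, std::size_t count)
{
    std::copy(from, from + count, to);
}
```

The practical difference is in what can go wrong: the memcpy version compiles with any pointer types and any byte count, while the std::copy version refuses mismatched element types at compile time.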

To deal with the new code style requirements I've decided to go ahead with the memcpy refactor anyway, but I have some questions about memcpy and std::copy:

memcpy is not type-safe, because it works with raw void pointers that accept any kind of pointer regardless of its type, but at the same time it is very flexible; std::copy lacks this flexibility but ensures type safety. At first sight, memcpy is the best choice for serialization and deserialization routines (that's my real use case), for example, to send some values through a custom serial port library:

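#include <cstring>  // std::memcpy
#include <string>

// sendBuffer() comes from the custom serial port library.
void send(const std::string &value)
{
    const std::string::size_type Size(value.size());
    const std::string::size_type TotalSize(sizeof(Size) + value.size());
    unsigned char *Buffer = new unsigned char[TotalSize];
    unsigned char *Current = Buffer;

    // Length prefix: the raw bytes of Size, in host byte order.
    std::memcpy(Current, &Size, sizeof(Size));
    Current += sizeof(Size);

    // Payload: the characters, without the terminating '\0'.
    std::memcpy(Current, value.c_str(), Size);

    sendBuffer(Buffer, TotalSize);

    delete []Buffer;
}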

The code above works fine, but looks horrible: we're bypassing the std::string encapsulation by accessing its internal memory through the std::string::c_str() method, we need to take care of allocating and deallocating dynamic memory, play with pointers and treat all values as unsigned chars (see the next part). The question is: is there a better way to do this?

My first attempts at solving the above problems using std::copy don't satisfy me entirely:

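#include <algorithm>  // std::copy
#include <string>
#include <vector>

void send(const std::string &value)
{
    const std::string::size_type Size(value.size());
    const std::string::size_type TotalSize(sizeof(Size) + value.size());

    std::vector<unsigned char> Buffer(TotalSize, 0);

    // Copy the bytes of Size. Copying from &Size to &Size + 1 would convert
    // the whole value to a single unsigned char and write only one byte.
    const unsigned char *SizeBytes = reinterpret_cast<const unsigned char *>(&Size);
    std::copy(SizeBytes, SizeBytes + sizeof(Size), Buffer.begin());
    std::copy(value.begin(), value.end(), Buffer.begin() + sizeof(Size));

    sendBuffer(Buffer.data(), TotalSize);
}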

With the above approach, memory management isn't a problem anymore: the std::vector takes responsibility for allocating, storing and finally deallocating the data at the end of the scope. But the calls mixing std::copy with pointer arithmetic and iterator arithmetic are pretty annoying and, in the end, I'm still bypassing the std::vector encapsulation in the sendBuffer call.

After the previous tries, I coded something with std::stringstream, but the results were even worse, and now I'm wondering:

Is there a way to serialize objects and values safely, without breaking encapsulation, without excessive or confusing pointer/iterator arithmetic and without manual dynamic memory management, or is it just an impossible goal? (Yes, I've heard about boost::serialization, but for now I'm not allowed to integrate it.)

And:

What's the best use of std::copy for serialization/deserialization purposes (if any)?
Is std::copy only meant for copying containers or arrays, and is using it for raw memory a bad choice?

`malloc`/`free` vs `new`/`delete` vs `std::allocator`

The other big topic is memory allocation. AFAIK the malloc/free functions aren't forbidden in C++ although they come from C, and the new/delete operators belong to C++ and aren't part of ANSI C.

Am I right?
Can new/delete be used in ANSI C?

Assuming that I need to refactor all C-flavoured code into C++ code, I'm getting rid of all the malloc/free calls spread around some legacy code, and I've found that reserving dynamic memory is quite confusing: the void type doesn't carry any size information, so it's impossible to reserve a data buffer using void as the type:

void *Buffer = new void[100]; // <-- How many bytes is each 'void'?

Due to the lack of pure raw-binary-data pointers, it is common practice to create pointers to unsigned char: char so that the element count equals the size in bytes, and unsigned to avoid unexpected signed-unsigned conversions during the data copy. Maybe it's common practice, but it's a mess... unsigned char isn't int, nor float, nor my_awesome_serialization_struct; if I'm forced to choose some kind of dummy pointer to binary data, I'd prefer void * over unsigned char *.
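The role of unsigned char as "the byte type" can be sketched as follows. `toBytes` and `fromBytes` are hypothetical helper names, and the round trip is only assumed to work on the same platform and compiler, since the bytes are the object representation of the value.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Copies the object representation of `value` into a fixed-size byte array.
template <typename T>
std::array<unsigned char, sizeof(T)> toBytes(const T &value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Rebuilds a value from bytes produced by toBytes on the same platform.
template <typename T>
T fromBytes(const std::array<unsigned char, sizeof(T)> &bytes)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

Here the array type carries the size of the original value, so a buffer of the wrong length for `T` is rejected at compile time, which a bare void * or unsigned char * cannot offer.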

So when I need a dynamic buffer for serialization/deserialization purposes, there's no way I can avoid the unsigned char * stuff and refactor into type-safe buffer management; but while I was rage-refactoring all the malloc/free pairs into new/delete pairs I read about std::allocator.

std::allocator allows reserving memory chunks in a type-safe way. At first sight I bet it would be useful, but there's no great difference between allocating with std::allocator<int>::allocate and new int, or so I thought; the same goes for std::allocator<int>::deallocate and delete.
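One difference that does exist can be shown with a small sketch: `allocate` only obtains raw storage, while `new` also runs constructors. The `Tracked` type and the three function names below are hypothetical and serve only to count constructor calls.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical type that counts how many instances were constructed.
struct Tracked {
    static int constructed;
    Tracked() { ++constructed; }
};
int Tracked::constructed = 0;

// Number of constructors run by `new Tracked[n]`.
inline int constructedByNew(std::size_t n)
{
    Tracked::constructed = 0;
    Tracked *items = new Tracked[n];
    const int count = Tracked::constructed;
    delete[] items;
    return count;
}

// Number of constructors run by allocate(n) alone: storage only, no objects.
inline int constructedByAllocate(std::size_t n)
{
    Tracked::constructed = 0;
    std::allocator<Tracked> alloc;
    Tracked *storage = alloc.allocate(n);
    const int count = Tracked::constructed;
    alloc.deallocate(storage, n);
    return count;
}

// Construction is a separate, explicit step when using an allocator.
inline int constructedByConstruct()
{
    typedef std::allocator_traits<std::allocator<Tracked> > Traits;
    Tracked::constructed = 0;
    std::allocator<Tracked> alloc;
    Tracked *storage = Traits::allocate(alloc, 1);
    Traits::construct(alloc, storage);  // runs Tracked() in the raw storage
    const int count = Tracked::constructed;
    Traits::destroy(alloc, storage);
    Traits::deallocate(alloc, storage, 1);
    return count;
}
```

This separation of allocation from construction is what containers such as std::vector rely on; for a plain byte buffer of unsigned char it makes little practical difference.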

And now I've lost my bearings on dynamic memory management, which is why I'm asking:

Is there a good C++ practice for dynamic memory management in serialization/deserialization code that provides type-safe management?
Is it possible to avoid the use of const char * for serialization/deserialization memory buffers?
What's the rationale of std::allocator, and what's its use in the serialization/deserialization scope (if any)?

Thanks for your attention!

920

asked Oct 15 '12 13:10

PaperBirdMaster

1 Answer

My experience is that type safety in C++ means not only that the compiler complains about type mismatches. Rather, it means that in general you should not have to care about the memory layout of your data. In fact, the C++ standard places only very few requirements on the memory layout of certain data types.

Your serialization is based on direct memory access, so I'm afraid there won't be a simple "pure" C++ solution, and in particular no general compiler- and platform-independent solution.
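To illustrate what avoiding the dependency on memory layout can look like, here is a sketch that encodes values arithmetically instead of copying their object representation. It rests on assumptions the question does not state: a 32-bit length prefix and little-endian order on the wire. The helper names `putU32LE`, `getU32LE` and `encodeString` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends `value` as 4 bytes, least significant byte first (assumed wire format).
inline void putU32LE(std::vector<unsigned char> &out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<unsigned char>((value >> shift) & 0xFFu));
}

// Reads 4 little-endian bytes starting at `offset`; the caller must make
// sure that in.size() >= offset + 4.
inline std::uint32_t getU32LE(const std::vector<unsigned char> &in,
                              std::size_t offset)
{
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[offset + i]) << (8 * i);
    return value;
}

// Encodes a string as a 32-bit length prefix followed by its characters.
// Strings longer than 2^32 - 1 characters are not handled in this sketch.
inline std::vector<unsigned char> encodeString(const std::string &text)
{
    std::vector<unsigned char> out;
    putU32LE(out, static_cast<std::uint32_t>(text.size()));
    out.insert(out.end(), text.begin(), text.end());
    return out;
}
```

Because the bytes are computed from the value with shifts, the result does not depend on the host's endianness or on the width of std::string::size_type, at the cost of writing an encoder and decoder for each type by hand.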

101

answered Oct 19 '22 23:10

bjhend

