I want to serialize and deserialize a class Mango, so I have created the functions serialize and deserialize, respectively.
? serialize(Mango &Man) /// What should the return type be?
{
}
Mango deserialize( ? ) /// What should the function parameter be?
{
}
I don't know how to implement this efficiently in terms of speed, portability, and memory, because the class contains 10 members of custom data types (I show only one here, but they are all similar), each of which is quite complex.
I would like suggestions for the implementation. For example, what should the return type of the serialize function be? A vector of bytes, i.e. std::vector<uint8_t> serialize(Mango &Man)? Or should it return nothing and just serialize the object into bytes stored in memory? Or is there some other way?
Mango Class
class Mango
{
public:
const MangoType &getMangoType() const { return typeMan; }
MangoType &getMangoType() { return typeMan; }
private:
// There are many members of different types : I just mention one.
MangoType typeMan;
};
Data type classes
//MangoType Class
class MangoType
{
/// It only has one member ie content
public:
/// Getter of content vector.
std::vector<FuntionMango> &getContent() noexcept { return Content; }
private:
/// \name Data of MangoType.
std::vector<FuntionMango> Content;
};
/// FuntionMango class.
class FuntionMango
{
public:
/// Getter of param types.
const std::vector<ValType> &getParamTypes() const noexcept
{
return ParamTypes;
}
std::vector<ValType> &getParamTypes() noexcept { return ParamTypes; }
/// Getter of return types.
const std::vector<ValType> &getReturnTypes() const noexcept
{
return ReturnTypes;
}
std::vector<ValType> &getReturnTypes() noexcept { return ReturnTypes; }
private:
/// \name Data of FuntionMango.
std::vector<ValType> ParamTypes;
std::vector<ValType> ReturnTypes;
};
//ValType Class
enum class ValType : uint8_t
{
#define UseValType
#define Line(NAME, VALUE, STRING) NAME = VALUE
#undef Line
#undef UseValType
};
I want to know the best possible implementation plan in terms of speed and memory for serialize and deserialize functions.
Note: 1) I do not want to transfer the data over the network. My use case is that loading the data into the Mango class is very time consuming every time (it is the result of a computation). So I want to serialize it, so that the next time I need it I can just deserialize the previously serialized data.
2) I do not want to directly use a library that requires linking, such as Boost Serialization. But is there any way to use it as header-only?
I commented:
Perhaps the examples in "Boost Serialization Binary Archive giving incorrect output" give you some inspiration. It's possible to write them without any Boost, obviously.
Because I hate when people say "obviously" on a Q&A site, let me show you. I'd suggest the interface to look like this:
std::vector<uint8_t> serialize(Mango const& Man);
Mango deserialize(std::span<uint8_t const> data);
Alternatively, for file IO you could support e.g.:
void serialize_to_stream(std::ostream& os, Mango const& Man);
void deserialize(std::istream& is, Mango& Man);
Using the approach from the linked example, the suggested implementations would look like:
std::vector<uint8_t> serialize(Mango const& Man) {
std::vector<uint8_t> bytes;
do_generate(back_inserter(bytes), Man);
return bytes;
}
Mango deserialize(std::span<uint8_t const> data) {
Mango result;
auto f = begin(data), l = end(data);
if (!do_parse(f, l, result))
throw std::runtime_error("deserialize");
return result;
}
void serialize_to_stream(std::ostream& os, Mango const& Man) {
do_generate(std::ostreambuf_iterator<char>(os), Man);
}
void deserialize(std::istream& is, Mango& Man) {
Man = {}; // clear it!
std::istreambuf_iterator<char> f(is), l{};
if (!do_parse(f, l, Man))
throw std::runtime_error("deserialize");
}
Of course, that assumes do_generate and do_parse customizations for all the relevant types (ValType, FuntionMango, MangoType, Mango):
Live On Coliru
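#include <algorithm>
#include <cstddef>     // size_t
#include <cstdint>     // uint8_t, uint16_t, uint32_t
#include <iomanip> // debug output
#include <iostream>
#include <iterator>    // back_inserter, stream-buffer iterators
#include <span>
#include <stdexcept>   // std::runtime_error
#include <string>
#include <type_traits> // is_trivially_copyable_v, underlying_type_t
#include <vector>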
namespace MangoLib {
// your requested signatures:
class Mango;
void serialize_to_stream(std::ostream& os, Mango const& Man);
void deserialize(std::istream& is, Mango& Man);
std::vector<uint8_t> serialize(Mango const& Man);
Mango deserialize(std::span<uint8_t const> data);
// your specified types (with some demo fill)
enum class ValType : uint8_t {
#define UseValType
#define Line(NAME, VALUE, STRING) NAME = VALUE
Line(void_, 0, "void"),
Line(int_, 1, "int"),
Line(bool_, 2, "bool"),
Line(string_, 3, "string"),
#undef Line
#undef UseValType
};
using ValTypes = std::vector<ValType>;
class FuntionMango {
public:
const ValTypes& getParamTypes() const noexcept { return ParamTypes; }
ValTypes& getParamTypes() noexcept { return ParamTypes; }
const ValTypes& getReturnTypes() const noexcept { return ReturnTypes; }
ValTypes& getReturnTypes() noexcept { return ReturnTypes; }
private:
ValTypes ParamTypes, ReturnTypes;
};
using FuntionMangos = std::vector<FuntionMango>;
class MangoType {
public:
FuntionMangos& getContent() noexcept { return Content; }
const FuntionMangos& getContent() const noexcept { return Content; }
private:
FuntionMangos Content;
};
class Mango {
public:
const MangoType& getMangoType() const { return typeMan; }
MangoType& getMangoType() { return typeMan; }
private:
MangoType typeMan;
// many other members
};
} // namespace MangoLib
namespace my_serialization_helpers {
////////////////////////////////////////////////////////////////////////////
// This namespace serves as an extension point for your serialization; in
// particular we choose endianness and representation of strings
//
// TODO add overloads as needed (signed integer types, binary floats,
// containers of... etc)
////////////////////////////////////////////////////////////////////////////
// decide on the max supported container capacity:
using container_size_type = std::uint32_t;
////////////////////////////////////////////////////////////////////////////
// generators
template <typename Out>
Out do_generate(Out out, std::string const& data) {
container_size_type len = data.length();
out = std::copy_n(reinterpret_cast<char const*>(&len), sizeof(len), out);
return std::copy(data.begin(), data.end(), out);
}
template <typename Out, typename T>
Out do_generate(Out out, std::vector<T> const& data) {
container_size_type len = data.size();
out = std::copy_n(reinterpret_cast<char const*>(&len), sizeof(len), out);
for (auto& el : data)
out = do_generate(out, el);
return out;
}
template <typename Out> Out do_generate(Out out, uint8_t const& data) {
return std::copy_n(&data, sizeof(data), out);
}
template <typename Out>
Out do_generate(Out out, uint16_t const& data) {
return std::copy_n(reinterpret_cast<char const*>(&data), sizeof(data), out);
}
template <typename Out>
Out do_generate(Out out, uint32_t const& data) {
return std::copy_n(reinterpret_cast<char const*>(&data), sizeof(data), out);
}
////////////////////////////////////////////////////////////////////////////
// parsers
template <typename It>
bool parse_raw(It& in, It last, char* raw_into, size_t n) { // length guarded copy_n
while (in != last && n) {
*raw_into++ = *in++;
--n;
}
return n == 0;
}
template <typename It, typename T>
bool parse_raw(It& in, It last, T& into) {
static_assert(std::is_trivially_copyable_v<T>);
return parse_raw(in, last, reinterpret_cast<char*>(&into), sizeof(into));
}
template <typename It>
bool do_parse(It& in, It last, std::string& data) {
container_size_type len;
if (!parse_raw(in, last, len))
return false;
data.resize(len);
return parse_raw(in, last, data.data(), len);
}
template <typename It, typename T>
bool do_parse(It& in, It last, std::vector<T>& data) {
container_size_type len;
if (!parse_raw(in, last, len))
return false;
data.clear();
data.reserve(len);
while (len--) {
data.emplace_back();
if (!do_parse(in, last, data.back()))
return false;
}
return true;
}
template <typename It>
bool do_parse(It& in, It last, uint8_t& data) {
return parse_raw(in, last, data);
}
template <typename It>
bool do_parse(It& in, It last, uint16_t& data) {
return parse_raw(in, last, data);
}
template <typename It>
bool do_parse(It& in, It last, uint32_t& data) {
return parse_raw(in, last, data);
}
}
namespace MangoLib {
template <typename Out> Out do_generate(Out out, ValType const& x) {
using my_serialization_helpers::do_generate;
return do_generate(out,
static_cast<std::underlying_type_t<ValType>>(x));
}
template <typename It> bool do_parse(It& in, It last, ValType& x) {
using my_serialization_helpers::do_parse;
std::underlying_type_t<ValType> tmp;
bool ok = do_parse(in, last, tmp);
if (ok)
x = static_cast<ValType>(tmp);
return ok;
}
template <typename Out> Out do_generate(Out out, FuntionMango const& x) {
using my_serialization_helpers::do_generate;
out = do_generate(out, x.getParamTypes());
out = do_generate(out, x.getReturnTypes());
return out;
}
template <typename It> bool do_parse(It& in, It last, FuntionMango& x) {
using my_serialization_helpers::do_parse;
return do_parse(in, last, x.getParamTypes()) &&
do_parse(in, last, x.getReturnTypes());
}
template <typename Out> Out do_generate(Out out, MangoType const& x) {
using my_serialization_helpers::do_generate;
out = do_generate(out, x.getContent());
return out;
}
template <typename It> bool do_parse(It& in, It last, MangoType& x) {
using my_serialization_helpers::do_parse;
return do_parse(in, last, x.getContent());
}
template <typename Out> Out do_generate(Out out, Mango const& x) {
out = do_generate(out, x.getMangoType());
return out;
}
template <typename It> bool do_parse(It& in, It last, Mango& x) {
return do_parse(in, last, x.getMangoType());
}
}
#include <cassert>
MangoLib::Mango makeMango() {
MangoLib::Mango mango;
using MangoLib::ValType;
MangoLib::FuntionMango f1;
f1.getParamTypes() = {ValType::bool_, ValType::string_};
f1.getReturnTypes() = {ValType::void_};
MangoLib::FuntionMango f2;
f2.getParamTypes() = {ValType::string_};
f2.getReturnTypes() = {ValType::int_};
mango.getMangoType().getContent() = {f1, f2};
return mango;
}
#include <fstream>
int main() {
auto const mango = makeMango();
auto const bytes = serialize(mango);
auto const roundtrip = serialize(MangoLib::deserialize(bytes));
assert(roundtrip == bytes);
// alternatively with file IO:
{
std::ofstream ofs("output.bin", std::ios::binary);
serialize_to_stream(ofs, mango);
}
// read back:
{
std::ifstream ifs("output.bin", std::ios::binary);
MangoLib::Mango from_file;
deserialize(ifs, from_file);
assert(serialize(from_file) == bytes);
}
std::cout << "\nDebug dump " << std::dec << bytes.size() << " bytes:\n";
for (auto ch : bytes)
std::cout << "0x" << std::hex << std::setw(2) << std::setfill('0')
<< static_cast<int>((uint8_t)ch) << " " << std::dec;
std::cout << "\nDone\n";
}
// suggested implementations:
namespace MangoLib {
std::vector<uint8_t> serialize(Mango const& Man) {
std::vector<uint8_t> bytes;
do_generate(back_inserter(bytes), Man);
return bytes;
}
Mango deserialize(std::span<uint8_t const> data) {
Mango result;
auto f = begin(data), l = end(data);
if (!do_parse(f, l, result))
throw std::runtime_error("deserialize");
return result;
}
void serialize_to_stream(std::ostream& os, Mango const& Man) {
do_generate(std::ostreambuf_iterator<char>(os), Man);
}
void deserialize(std::istream& is, Mango& Man) {
Man = {}; // clear it!
std::istreambuf_iterator<char> f(is), l{};
if (!do_parse(f, l, Man))
throw std::runtime_error("deserialize");
}
}
Which roundtrips correctly and prints the debug output:
Debug dump 25 bytes:
0x02 0x00 0x00 0x00 0x02 0x00 0x00 0x00 0x02 0x03 0x01 0x00 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x03 0x01 0x00 0x00 0x00 0x01
Done
This assumes endianness is not an issue. Of course, you might want to normalize endianness. You can do it manually (e.g. using the ntoh/hton family), or you could use Boost Endian, which does not require linking to any Boost library (Boost Endian is header-only).
E.g.: http://coliru.stacked-crooked.com/a/288829ec964a3ca9
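For the manual route, here is a minimal sketch that avoids both ntoh/hton and Boost by writing the bytes with shifts. The helper names put_u32_le and get_u32_le are hypothetical (they are not part of the code above), and little-endian is an assumed choice for the on-disk order; it happens to produce the same bytes as the debug dump above on a little-endian host.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical helper: write a 32-bit value least-significant byte first,
// regardless of the byte order of the machine running the code.
template <typename Out>
Out put_u32_le(Out out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8)
        *out++ = static_cast<std::uint8_t>((value >> shift) & 0xFFu);
    return out;
}

// Hypothetical helper: read the value back; returns false on truncated input
// and leaves `value` untouched in that case.
template <typename It>
bool get_u32_le(It& in, It last, std::uint32_t& value) {
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        if (in == last)
            return false;
        const auto byte = static_cast<std::uint8_t>(*in++);
        result |= static_cast<std::uint32_t>(byte) << shift;
    }
    value = result;
    return true;
}

// Usage sketch: round-trip a length prefix through a byte buffer.
inline void roundtrip_example() {
    std::vector<std::uint8_t> bytes;
    put_u32_le(std::back_inserter(bytes), 0x01020304u);
    auto first = bytes.cbegin();
    std::uint32_t decoded = 0;
    const bool ok = get_u32_le(first, bytes.cend(), decoded);
    assert(ok && decoded == 0x01020304u);
    (void)ok; // silence unused-variable warnings when NDEBUG is defined
}
```

Something like this could replace the reinterpret_cast-based copies of the length prefix in the container overloads, so that a file written on one architecture can be read on another.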
As @Eljay says in a comment, the exact solution depends on the use case.
For me, if it is a one-off project, the most straightforward "binary dump" method would be to reconsider your basic data types and store everything compactly, using fixed-size structures.
struct FuntionMango
{
int NumParams; // valid items in Param/Return arrays
int NumReturns;
ValType ParamTypes[MAX_PARAMS];
ValType ReturnTypes[MAX_RETURNS];
};
struct MangoType
{
int NumContent; // valid items in Content array
// Fixed array instead of vector<FuntionMango>
FuntionMango Content[MAX_FUNCTIONS];
};
struct Mango // all fields are just 'public'
{
MangoType typeMan;
};
Then the "save" procedure would be
void saveMango(const char* filename, Mango* mango)
{
FILE* OutFile = fopen(...);
fwrite(mango, 1, sizeof(Mango), OutFile);
fclose(OutFile);
}
and load just uses "fread" (of course, all error handling and file integrity checking is omitted)
void loadMango(const char* filename, Mango* mango)
{
FILE* InFile = fopen(...);
fread(mango, 1, sizeof(Mango), InFile);
fclose(InFile);
}
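Since the error handling is omitted above, here is a rough sketch of what a checked variant might look like. Everything in it is an assumption for illustration: the limits (MAX_PARAMS and friends are left undefined above), the ValType values, the "wb"/"rb" open modes, and the names saveMangoChecked, loadMangoChecked and looksValid.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Assumed limits and enum values, chosen only to make the sketch compile.
constexpr int MAX_PARAMS = 8;
constexpr int MAX_RETURNS = 4;
constexpr int MAX_FUNCTIONS = 16;
enum class ValType : std::uint8_t { void_ = 0, int_ = 1, bool_ = 2, string_ = 3 };

struct FuntionMango {
    int NumParams;  // valid items in ParamTypes
    int NumReturns; // valid items in ReturnTypes
    ValType ParamTypes[MAX_PARAMS];
    ValType ReturnTypes[MAX_RETURNS];
};
struct MangoType {
    int NumContent; // valid items in Content
    FuntionMango Content[MAX_FUNCTIONS];
};
struct Mango {
    MangoType typeMan;
};

// Basic sanity check: every stored count must fit its fixed-size array.
bool looksValid(const Mango& m) {
    const MangoType& t = m.typeMan;
    if (t.NumContent < 0 || t.NumContent > MAX_FUNCTIONS)
        return false;
    for (int i = 0; i < t.NumContent; ++i) {
        const FuntionMango& f = t.Content[i];
        if (f.NumParams < 0 || f.NumParams > MAX_PARAMS)
            return false;
        if (f.NumReturns < 0 || f.NumReturns > MAX_RETURNS)
            return false;
    }
    return true;
}

// Writes the whole struct in one block; reports open, write and close errors.
bool saveMangoChecked(const char* filename, const Mango& mango) {
    assert(filename != nullptr);
    std::FILE* out = std::fopen(filename, "wb");
    if (!out)
        return false;
    const bool written = std::fwrite(&mango, sizeof(Mango), 1, out) == 1;
    const bool closed = std::fclose(out) == 0;
    return written && closed;
}

// Reads into a temporary first, so `mango` is only changed on success.
bool loadMangoChecked(const char* filename, Mango& mango) {
    assert(filename != nullptr);
    std::FILE* in = std::fopen(filename, "rb");
    if (!in)
        return false;
    Mango tmp;
    const bool complete = std::fread(&tmp, sizeof(Mango), 1, in) == 1;
    std::fclose(in);
    if (!complete || !looksValid(tmp))
        return false;
    mango = tmp;
    return true;
}
```

The range check only catches counts that would index past the arrays; it does not detect a file written by a different compiler, architecture, or struct layout, so a real cache file would probably also want a version or size marker.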
To convert your Mango into a byte array, just use a reinterpret_cast or a C-style cast.
Unfortunately, this approach would fail if any of your structures either contains pointer fields or has non-trivial constructors/destructors.
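Part of that restriction can be checked at compile time. The sketch below uses the standard type traits; the two structs and the can_binary_dump helper are hypothetical names made up for the illustration.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size layout: plain data only, safe to dump byte for byte.
struct FlatFunction {
    int NumParams;
    std::uint8_t ParamTypes[8];
};

// Hypothetical vector-based layout: the object itself only holds bookkeeping
// for heap memory, so a raw dump would not store the elements.
struct VectorFunction {
    std::vector<std::uint8_t> ParamTypes;
};

// True when a type can be copied with fwrite/fread/memcpy.
template <typename T>
constexpr bool can_binary_dump =
    std::is_trivially_copyable_v<T> && std::is_standard_layout_v<T>;

static_assert(can_binary_dump<FlatFunction>,
              "fixed-size struct can be written as raw bytes");
static_assert(!can_binary_dump<VectorFunction>,
              "struct holding a std::vector must not be written as raw bytes");
```

Note that a raw pointer member is itself trivially copyable, so this check will not flag it; pointer fields still have to be avoided by inspection.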
[EDIT (on request)]
Conversion to a byte array (filling a std::vector<uint8_t>) can be done by using the iterator-range constructor of std::vector:
Mango mango;
uint8_t* rawPointer = reinterpret_cast<uint8_t*>(&mango);
std::vector<uint8_t> byteArray(rawPointer, rawPointer + sizeof(Mango));
And vice versa, to convert the byte array back to a Mango (memcpy needs <cstring>):
Mango otherMango;
uint8_t* rawPointer2 = reinterpret_cast<uint8_t*>(&otherMango);
memcpy(rawPointer2, byteArray.data(), sizeof(Mango));
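The memcpy above trusts that the byte array has exactly sizeof(Mango) bytes. A possible way to wrap both directions with a length check is sketched below; to_bytes, from_bytes and SmallRecord are hypothetical names, and SmallRecord merely stands in for the fixed-size Mango.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy any trivially copyable object into a byte vector.
template <typename T>
std::vector<std::uint8_t> to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "raw byte copy needs a trivially copyable type");
    std::vector<std::uint8_t> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Hypothetical helper: copy the bytes back, refusing input of the wrong
// length instead of reading out of bounds.
template <typename T>
bool from_bytes(const std::vector<std::uint8_t>& bytes, T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "raw byte copy needs a trivially copyable type");
    if (bytes.size() != sizeof(T))
        return false;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return true;
}

// Stand-in for a fixed-size struct such as the Mango above.
struct SmallRecord {
    int count;
    std::uint8_t tags[4];
};

// Usage sketch: convert to bytes and back, then compare the fields.
inline void bytes_roundtrip_example() {
    const SmallRecord original{2, {7, 9, 0, 0}};
    SmallRecord copy{};
    const bool ok = from_bytes(to_bytes(original), copy);
    assert(ok && copy.count == 2 && copy.tags[1] == 9);
    (void)ok; // silence unused-variable warnings when NDEBUG is defined
}
```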