Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Serializing function objects

Is it possible to serialize and deserialize a std::function, a function object, or a closure in general in C++? How? Does C++11 facilitate this? Is there any library support available for such a task (e.g., in Boost)?

For example, suppose a C++ program has a std::function which is needed to be communicated (say via a TCP/IP socket) to another C++ program residing on another machine. What do you suggest in such a scenario?


Edit:

To clarify, the functions which are to be moved are supposed to be pure and side-effect-free. So I do not have security or state-mismatch problems.

A solution to the problem is to build a small embedded domain specific language and serialize its abstract syntax tree. I was hoping that I could find some language/library support for moving a machine-independent representation of functions instead.

like image 633
shaniaki Avatar asked Sep 09 '12 10:09

shaniaki


2 Answers

Yes for function pointers and closures. Not for std::function.

A function pointer is the simplest — it is just a pointer like any other so you can just read it as bytes:

template <typename _Res, typename... _Args>
std::string serialize(_Res (*fn_ptr)(_Args...)) {
  return std::string(reinterpret_cast<const char*>(&fn_ptr), sizeof(fn_ptr));
}

template <typename _Res, typename... _Args>
_Res (*deserialize(std::string str))(_Args...) {
  return *reinterpret_cast<_Res (**)(_Args...)>(const_cast<char*>(str.c_str()));
}                   

But I was surprised to find that even without recompilation the address of a function will change on every invocation of the program. Not very useful if you want to transmit the address. This is due to ASLR, which you can turn off on Linux by starting your_program with setarch $(uname -m) -LR your_program.

Now you can send the function pointer to a different machine running the same program, and call it! (This does not involve transmitting executable code. But unless you are generating executable code at run-time, I don't think you are looking for that.)

A lambda function is quite different.

std::function<int(int)> addN(int N) {
  auto f = [=](int x){ return x + N; };
  return f;
}

The value of f will be the captured int N. Its representation in memory is the same as an int! The compiler generates an unnamed class for the lambda, of which f is an instance. This class has operator() overloaded with our code.

The class being unnamed presents a problem for serialization. It also presents a problem for returning lambda functions from functions. The latter problem is solved by std::function.

std::function as far as I understand is implemented by creating a templated wrapper class which effectively holds a reference to the unnamed class behind the lambda function through the template type parameter. (This is _Function_handler in functional.) std::function takes a function pointer to a static method (_M_invoke) of this wrapper class and stores that plus the closure value.

Unfortunately, everything is buried in private members and the size of the closure value is not stored. (It does not need to, because the lambda function knows its size.)

So std::function does not lend itself to serialization, but works well as a blueprint. I followed what it does, simplified it a lot (I only wanted to serialize lambdas, not the myriad other callable things), saved the size of the closure value in a size_t, and added methods for (de)serialization. It works!

like image 165
Daniel Darabos Avatar answered Oct 10 '22 00:10

Daniel Darabos


No.

C++ has no built-in support for serialization and was never conceived with the idea of transmitting code from one process to another, lest one machine to another. Languages that may do so generally feature both an IR (intermediate representation of the code that is machine independent) and reflection.

So you are left with writing yourself a protocol for transmitting the actions you want, and the DSL approach is certainly workable... depending on the variety of tasks you wish to perform and the need for performance.

Another solution would be to go with an existing language. For example the Redis NoSQL database embeds a LUA engine and may execute LUA scripts, you could do the same and transmit LUA scripts on the network.

like image 41
Matthieu M. Avatar answered Oct 10 '22 00:10

Matthieu M.