
Two-phase construction in a real-time system

I am developing a real-time system, and I am debating the design of the classes.
Specifically, I can't decide whether to build the "heavy" classes using two-phase construction.

On the one hand, calling the constructor of a "heavy" class can be a major bottleneck at run time, and deferring construction saves me from creating objects and allocating memory for features the user might never use.

On the other hand, two-phase construction can cause surprises during execution: we may try to access a capability and find that we can't, because it hasn't been initialized, and suddenly we need to fully build it before using it.

My tendency is to go with two-phase construction. What I'd like to hear are the pros/cons of two-phase construction in a real-time system, and whether there is a better approach.

Here is a code example of a heavy class (my real classes certainly won't look like this, but it demonstrates the idea):

    class VeryHeavy {
    private:
        HeavyClass1* p1;
        HeavyClass2* p2;
        HeavyClass3* p3;
        HeavyClass4* p4;
        HeavyClass5* p5;

        int* hugeArray[100000];

        // ...
    };
Asked by Ran Eldan, Jul 25 '13


3 Answers

[Photo: the Apollo Guidance Computer with its display/keyboard unit]

This is the AGC, the Apollo Guidance Computer, used both on the Apollo command module and the lunar module. Famous for almost causing the Apollo 11 landing to be scrubbed. Right in the middle of the descent to the Moon's surface, this computer crashed on a real-time error. Several times. Producing System Error 1201 (Executive overflow - no vacant areas) and System Error 1202 (Executive overflow - no core sets). Armstrong and Aldrin only saw the number; the UI device you see on the right of the photo was too primitive to show strings. It was guidance controller Steve Bales who knew what the numbers meant (they had never seen the error while training) and who knew that the system could recover from it. He saved the landing by giving the GO anyway, and got the Presidential Medal of Freedom for it.

This is probably what your question is asking about, although we can be pretty sure that you are not trying to land a rocket. The term "real time" used to be pretty well defined in software engineering, but it got muddled by the financial industry. On Apollo 11 it meant a system that has a hard upper limit on the maximum response time to external events. Rockets need such a system: they can't afford to be late when adjusting the nozzle, and being late once produces a billion-dollar ball of fire. The financial industry hijacked the term to mean a system that is merely arbitrarily fast; being late sometimes isn't going to vaporize the machine, although it does make the odds of a trading loss greater. They probably consider that a disaster as well :)

The memory allocator you use matters a lot, and it is also not specified in the question. I'll arbitrarily assume your program runs on a demand-paged virtual memory operating system. Not exactly the ideal environment for a real-time system, but common enough; true real-time operating systems haven't fared well.

Two-phase construction is a technique used to deal with initialization failure. Exceptions thrown in a constructor are difficult to deal with: the destructor will not run, which can cause a resource leak if you allocate in the constructor without otherwise making the constructor smart enough to deal with a mishap. The alternative is to do the work later, inside a member function, lazily allocating as needed.
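
A minimal sketch of the technique being described (the class and member names are illustrative, not from the question): phase one is a constructor that cannot fail, phase two is an Init() member that does the heavy allocation and reports failure through a return value rather than an exception:

    #include <memory>
    #include <new>

    class VeryHeavy {
        std::unique_ptr<int[]> hugeArray;   // stays empty until Init() runs
    public:
        VeryHeavy() = default;              // phase one: trivial, cannot fail

        // Phase two: the expensive allocation. Failure is reported by
        // the return value instead of an exception escaping a constructor.
        bool Init() {
            hugeArray.reset(new (std::nothrow) int[100000]);
            return hugeArray != nullptr;
        }

        bool IsReady() const { return hugeArray != nullptr; }
    };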

So what you worry about is that lazy allocation is going to hamper the responsiveness of the system. Producing System Error 1201.

This is not in fact a primary concern on a demand-paged virtual memory operating system like Linux or Windows. The memory allocator on these operating systems is fast: it only allocates virtual memory, which doesn't cost anything; it is virtual. The true cost comes later, when you actually start to use the allocated memory. That is where the "demand" of demand-paged comes into play. Addressing an array element for the first time produces a page fault, forcing the operating system to map the addressed virtual memory page into RAM. Such page faults are relatively cheap, called "soft" page faults, as long as the machine isn't otherwise under pressure and forced to unmap a page being used by another process to acquire the RAM. You'd expect the OS to be able to just grab a free page and map it; the overhead is measured in microseconds.

So in effect, if you do it right and don't initialize the entire array when you allocate it, your program will be subjected to tens of thousands of tiny needle pricks of overhead, each one small enough not to endanger a real-time response guarantee. This happens regardless of whether you allocate the memory early or late, so whether you use two-phase construction doesn't matter.
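
A sketch of what "doing it right" looks like (the array size here is arbitrary): the allocation itself returns almost immediately, and the cost is paid lazily, one soft page fault per page, as elements are first touched:

    #include <cstddef>

    int main() {
        const std::size_t N = 100000000;

        // Fast: this essentially reserves virtual address space; the
        // pages are not yet backed by RAM (note: no (), so the
        // elements stay uninitialized and untouched).
        int* a = new int[N];

        // The true cost trickles in here, one soft page fault per
        // 4 KB page, as each page is touched for the first time.
        for (std::size_t i = 0; i < N; i += 4096 / sizeof(int))
            a[i] = 0;

        delete[] a;
        return 0;
    }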

If you want to guarantee that this doesn't happen either, or want to be resilient against the storm of page faults you get when you do initialize the entire array, then you'll need a very different approach: you need to page-lock the RAM allocation so that the operating system cannot unmap the pages. This invariably requires tinkering with the OS settings, since it typically doesn't allow a process to page-lock large amounts of memory. Two-phase construction is then out the door as well, of course.
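
On Linux, for example, that page-locking is done with mlock() (a sketch; the OS setting to tinker with is the process's memlock limit, RLIMIT_MEMLOCK):

    #include <sys/mman.h>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        const std::size_t bytes = 100000 * sizeof(int);
        int* hugeArray = static_cast<int*>(std::malloc(bytes));
        if (hugeArray == nullptr) return 1;

        // Faults every page into RAM and forbids the OS from unmapping
        // them; fails if the memlock limit is too low.
        if (mlock(hugeArray, bytes) != 0) {
            std::perror("mlock");
            return 1;
        }

        // ... real-time work: no page faults on hugeArray ...

        munlock(hugeArray, bytes);
        std::free(hugeArray);
        return 0;
    }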

Do keep in mind that it is pretty rare for a program to know how to deal with allocation failure. Such failures behave almost like asynchronous exceptions, ready to strike at any point in time in nearly any part of the program. They are especially hard to reconcile with the real-time requirement: a system that has no response to a real-time event because it ran out of memory is of course no better than one that's late. That's still a ball of fire ;) So that in itself should already be reason enough not to bother with two-phase construction: just allocate the memory at program initialization time, before you start promising real-time response. It makes coding the program a lot simpler, and the odds of failure are much lower.
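
A sketch of that approach (the buffer size is illustrative): every heavy allocation happens in a startup phase that can fail early, and nothing allocates once real-time response has been promised:

    #include <cstdio>
    #include <new>
    #include <vector>

    static std::vector<int> hugeArray;   // lives for the whole program

    int main() {
        try {
            // Allocates and zero-fills, touching every page up front.
            hugeArray.resize(100000);
        } catch (const std::bad_alloc&) {
            std::fputs("out of memory at startup\n", stderr);
            return 1;   // fail now, before promising real-time response
        }

        // ... enter the real-time loop; no further allocation ...
        return 0;
    }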

A pretty hard requirement for any software that runs with real-time characteristics is that it won't have to fight with other processes to acquire operating system resources. Dedicating the entire machine to just one process is expected; you are no longer restricted to 36864 words of rope memory and 2048 words of RAM like the AGC was. Hardware is cheap and plentiful enough these days to provide such a guarantee.

Answered by Hans Passant


Hans Passant's answer insightfully describes why you should try not to use lazy initialization under "real-time" requirements.

But if you really need "lazy", you should try not to put the burden of a repetitive if(!is_constructed) construct(); on the class's user and implementor.

First of all, consider cheap default construction, like std::vector has:

vector<int> x;

It constructs an empty vector. And, for instance, you may safely call begin(x) and end(x); in that sense the object is valid and constructed.
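
In other words, the default constructor does essentially no work, yet the object already honors its full interface; the heavy allocation can then happen on demand:

    #include <iterator>
    #include <vector>

    int main() {
        std::vector<int> x;       // cheap: no elements, typically no allocation

        // The empty object is fully usable: begin/end are safe to call.
        bool is_empty = (std::begin(x) == std::end(x));   // true

        x.resize(100000);         // the heavy work happens here, on demand
        return is_empty ? 0 : 1;
    }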

But if your class really must do heavy work in its constructor, and you want to avoid that work until first use, then consider making a reusable, non-intrusive lazy initializer. It does the initialization on first use automatically, without forcing the user and implementor to write boilerplate checks.

Here is possible usage:

struct Widget
{
    Widget(int x)
    {
        cout << "Widget(" << x << ")" << endl;
    }
    void foo()
    {
        cout << "Widget::foo()" << endl;
    }
};

int main()
{
    auto &&x = make_lazy<Widget>(11);
    cout << "after make_lazy" << endl;
    x->foo();
}

Output is:

after make_lazy
Widget(11)
Widget::foo()

Complete compilable demo:

#include <boost/utility/in_place_factory.hpp>
#include <boost/optional.hpp>
#include <iostream>
#include <utility>

using namespace boost;
using namespace std;

template<typename T, typename Factory>
class Lazy
{
    mutable optional<T> x;   // storage for the lazily-constructed object
    Factory f;               // remembers the constructor arguments

    T *constructed() const
    {
        if(!x) x = f;        // first access: construct in place from the factory
        return &*x;
    }
public:
    Lazy(Factory &&f) : f(std::move(f)) {}

    T *operator->()
    {
        return constructed();
    }
    const T *operator->() const
    {
        return constructed();
    }
};

// Wraps the constructor arguments in a Boost in-place factory,
// deferring T's construction until the first dereference.
template<typename T, typename ...Args>
auto make_lazy(Args&&... args) -> Lazy<T, decltype(in_place(forward<Args>(args)...))>
{
    return {in_place(forward<Args>(args)...)};
}

/*****************************************************/

struct Widget
{
    Widget(int x)
    {
        cout << "Widget(" << x << ")" << endl;
    }
    void foo()
    {
        cout << "Widget::foo()" << endl;
    }
};

int main()
{
    auto &&x = make_lazy<Widget>(11);
    cout << "after make_lazy" << endl;
    x->foo();
}
Answered by Evgeny Panasyuk


Main "pro" for two-phase approach if we have 2 entities. First one provides interface IFirst and requires external ISecond implementation. Second one provides ISecond and requires IFirst in turn. Without two-phase init, this is "chicken and egg" unresolvalbe question.

As for heavy objects under tight constraints (real-time/mobile/embedded), it may be worth making objects as thin and lazy as possible. Potentially, it may be the caller's responsibility to perform a series of init calls before using some functionality, just to make sure that everything is initialized the right way before jumping onboard.

Answered by Yury Schkatula