 

Guaranteed Detection of Temporary->Named Points

Suppose you write a matrix class with some operations:

class matrix
{
public:
    double operator()(size_t i, size_t j) const;
    ...
};

matrix operator*(const matrix &lhs, const matrix &rhs);
...

It makes sense to defer the evaluation of some matrix expressions: m0 * m1 * m2 * m3 * m4 (which is a series of four operator* calls) can benefit from using the dynamic-programming matrix chain multiplication algorithm; the very common m0 * m1^t has a very efficient dgemm implementation, and so forth.
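
To make the chain-multiplication point concrete, here is a toy cost count (the dimensions are made up for this example; cost(m, n, k) is just the scalar-multiplication count of a naive (m x n) * (n x k) product):

#include <cstddef>
#include <iostream>

constexpr std::size_t cost(std::size_t m, std::size_t n, std::size_t k)
{
    return m * n * k; // scalar multiplications in a naive (m x n) * (n x k) product
}

int main()
{
    // A: 1000 x 10, B: 10 x 1000, C: 1000 x 10
    const auto left  = cost(1000, 10, 1000) + cost(1000, 1000, 10); // (A * B) * C
    const auto right = cost(10, 1000, 10)   + cost(1000, 10, 10);   // A * (B * C)
    std::cout << left << " vs " << right << '\n'; // 20000000 vs 200000
}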

It consequently pays to defer the actual computation until it is actually needed. This changes the class above to:

class matrix
{
private:
    /*
    * Pointer to an abstract base class - either an actual matrix, 
    *    or an expression tree. */
    std::shared_ptr<matrix_imp> m_imp;

public:
    // Forces compaction of the expression tree.
    double operator()(size_t i, size_t j) const;
    ...
};

/* Lazy; creates a matrix with an expression tree using the
*    internals of lhs and rhs. */
matrix operator*(const matrix &lhs, const matrix &rhs);
...

Each matrix holds a pointer to a base class object that can range from a real matrix to a complicated expression tree. Each operation tries to form a matrix using the laziest change to the internal implementation. Some operations have no choice but to actually evaluate things, compact the expression tree, and set the internal implementation to an actual matrix.
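
As a rough sketch of the kind of hierarchy this implies (the names and members here are illustrative, not the actual implementation):

#include <cstddef>
#include <memory>
#include <vector>

// Abstract implementation node: either concrete data or a deferred expression.
struct matrix_imp
{
    virtual ~matrix_imp() = default;
    virtual std::size_t rows() const = 0;
    virtual std::size_t cols() const = 0;
    // A dense node reads memory; an expression node computes on demand.
    virtual double at(std::size_t i, std::size_t j) const = 0;
};

// Concrete, compacted matrix: owns its elements.
struct dense_matrix : matrix_imp
{
    std::size_t r, c;
    std::vector<double> data; // row-major storage
    std::size_t rows() const override { return r; }
    std::size_t cols() const override { return c; }
    double at(std::size_t i, std::size_t j) const override { return data[i * c + j]; }
};

// Deferred product node: keeps both operands alive via shared ownership.
struct product_node : matrix_imp
{
    std::shared_ptr<matrix_imp> lhs, rhs;
    std::size_t rows() const override { return lhs->rows(); }
    std::size_t cols() const override { return rhs->cols(); }
    double at(std::size_t i, std::size_t j) const override
    {
        double s = 0.0;
        for (std::size_t k = 0; k < lhs->cols(); ++k)
            s += lhs->at(i, k) * rhs->at(k, j);
        return s;
    }
};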


The problem was that, in practice, this caused huge memory overhead in very common cases. Say you read from a file a long-and-narrow p x q matrix x, with p >> q, store x^t x in a variable, and discard x. With lazy evaluation, the retained memory is pq rather than q^2, and pq >> q^2. Load such matrices in a loop, and it becomes a serious problem. (Of course, compaction can be forced after each load by having the client code call operator(), but requiring this without algorithmic justification is ugly and error prone.)
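
In code, the problematic pattern looks roughly like this (load() and file_names are placeholders for whatever the client actually does):

std::vector<matrix> grams;
for (const std::string &f_name : file_names)
{
    matrix x = load(f_name);    // p x q, p >> q
    grams.push_back(x.t() * x); // stores an expression tree that shares x's data
}
// Each stored element needs only q * q doubles once evaluated, but the lazy
// trees keep all the p * q inputs alive until something forces compaction.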

Initially, I thought that the move ctor was a good point for automatic compaction - it's exactly the point where a temporary becomes a named object, and it's named objects that cause increasing memory consumption, so

matrix(matrix &&other); // <- Force compaction only here

would appear to solve everything, e.g.,

auto res = // <- temp becoming named
    a * // temp
    b * // temp
    c + // temp
    2 * // temp
    d;
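
As a sketch, such a compacting move constructor might look like this (is_dense() and compact() are hypothetical helpers, not part of the class shown above):

matrix::matrix(matrix &&other)
    : m_imp(std::move(other.m_imp))
{
    // Hypothetical: if what we took over is still an expression tree,
    // evaluate it now, so the named object owns a dense matrix and no
    // longer keeps its operands alive.
    if (m_imp && !m_imp->is_dense())
        m_imp = compact(*m_imp); // returns a dense matrix_imp
}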

but can it be counted on? E.g., consider

matrix load_xtx(const string &f_name)
{
    matrix x = ...
    return x.t() * x; 
}

auto xtx = load_xtx("foo.hdf5"); // (*)

is the compiler forbidden from doing at (*) something similar to what it does with NRVO, namely constructing the object in place and thus skipping the move constructor? Even if it is, might a compiler optimize the move away in other cases?
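
A standalone toy (independent of the matrix class above) shows how copy elision can bypass the move constructor entirely; with most compilers, and under C++17's guaranteed elision for the prvalue case, nothing is printed:

#include <iostream>

struct probe
{
    probe() = default;
    probe(probe &&) noexcept { std::cout << "move\n"; } // the would-be compaction point
    probe(const probe &)     { std::cout << "copy\n"; }
};

probe make_named()
{
    probe p;
    return p;       // NRVO: typically constructed directly in the caller
}

probe make_temp()
{
    return probe{}; // prvalue: the move is elided (guaranteed in C++17)
}

int main()
{
    auto a = make_named(); // usually prints nothing; "move" only if NRVO is not done
    auto b = make_temp();  // prints nothing under guaranteed elision
    (void)a; (void)b;
}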

asked Jun 02 '15 by Ami Tavory
1 Answer

Since the "internal pointer" method cannot give all the flexibility needed for the deferred evaluation, the typical solution used by C++ numerical libraries is to define specialized classes implementing lazy evaluation mechanisms. The old SO question Lazy evaluation in C++ and its best answers show the basics of such design and some sample code.

Although I am not an expert, I think good examples of this architecture are the numerical libraries Eigen (see here for some details about its implementation) and Blitz++, which relies heavily on templates (I did not find up-to-date documentation of its internals on the web, but this article describes part of its engine and also gives a broader overview of the "expression template" technique).
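
For a flavor of the technique, here is a bare-bones expression-template sketch for vectors (nothing like the real Eigen or Blitz++ code; a real library would constrain the operator and handle sizes, aliasing, and much more):

#include <cstddef>
#include <utility>
#include <vector>

// A trivial dense vector; real libraries apply the same idea to matrices.
struct vec
{
    std::vector<double> data;

    explicit vec(std::vector<double> d) : data(std::move(d)) {}

    // Evaluating constructor: walks any expression element by element,
    // so a + b + c is computed in one pass with no temporaries.
    template <class Expr>
    vec(const Expr &e) : data(e.size())
    {
        for (std::size_t i = 0; i < e.size(); ++i)
            data[i] = e[i];
    }

    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Expression node encoding "l + r" in its type; nothing is computed here.
template <class L, class R>
struct add_expr
{
    const L &l;
    const R &r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
    std::size_t size() const { return l.size(); }
};

// A real library would constrain this to its own types (SFINAE/concepts).
template <class L, class R>
add_expr<L, R> operator+(const L &l, const R &r) { return {l, r}; }

int main()
{
    vec a(std::vector<double>{1, 2, 3});
    vec b(std::vector<double>{4, 5, 6});
    vec c = a + b + a; // single loop; yields {6, 9, 12}
    (void)c;
}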

answered Oct 05 '22 by Alberto M