
What c++11 paradigm should I use to minimize memory-usage and minimize copying?

PROBLEM

I have an abstract interface Series and a concrete class Primary_Series which satisfies the interface by storing a large std::vector<> of values.

I also have another concrete class Derived_Series which is essentially a transform of the Primary_Series (e.g. some large Primary_Series multiplied by 3). I want it to be space-efficient, so I do not want to store the entire derived series as a member.

#include <vector>

template<typename T>
struct Series
{
  virtual ~Series() {}  // allow deletion through a Series<T>*
  virtual std::vector<T> const& ref() const = 0;
};

template<typename T>
class Primary_Series : public Series<T>
{
  std::vector<T>  m_data;
public:
  virtual std::vector<T> const& ref() const override { return m_data; }
};

template<typename T>
class Derived_Series : public Series<T>
{
  // how to implement ref() ?
};

QUESTION

How should I change this interface/pure-virtual method?

I don't want to return that vector by value because it would introduce unnecessary copying for Primary_Series, but in the Derived_Series case, I definitely need to create some kind of temporary vector. But then I am faced with the issue of how do I make that vector go away once the caller is done with it.

It would be nice if ref() would return a reference to a temporary that goes away as the reference goes away.

Does this mean I should use some kind of std::weak_ptr<>? Would this fit with how Primary_Series works?

What is the best approach to satisfy the "minimize memory usage" and "minimize copying" requirements, including making the Derived_Series temporary go away once the caller is done?

kfmfe04 asked Feb 23 '14 19:02


2 Answers

Well, the interface design as it stands poses a bit of a problem, because C++ doesn't really do lazy evaluation.

Now, since Derived_Series is supposed to be a lazily-evaluated transform of the original Primary_Series (because you want to be space-efficient), you cannot return a reference to a full, fat vector. (That would require you to construct it first.)

So we have to change the interface and the way the _Series share data. Use std::shared_ptr<std::vector<>> to share the data between the Primary_Series and Derived_Series, so that Primary_Series going out of scope cannot invalidate data for your transform.

Then you can change your interface to be more "vector-like". That is, implement some (or all) of the usual data-accessing functions (operator[], at()...) and/or custom iterators that return transformed values computed from the original series. These let you hide the implementation details (laziness of the transform, sharing of data...) while still returning transformed values without materializing a temporary vector. People can use your class like a vector, so you don't have to change much of your design. (Almost any algorithm that uses a vector can use your class once it is written against this interface.)

I've also sketched out a very basic example of what I mean.

(Note: If you have a multithreaded design and mutable Primary_Series, you will have to think a bit about where and what you need synchronized.)
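To make the "vector-like" idea concrete, here is a minimal sketch. It is not the example the answer refers to, only an illustration under stated assumptions: the size()/at() interface, the data() accessor, the constant-factor transform and the series_sum helper are all hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Vector-like interface: elements come back by value, so a lazy
// implementation never has to materialise a whole vector.
template <typename T>
struct Series {
    virtual ~Series() {}
    virtual std::size_t size() const = 0;
    virtual T at(std::size_t i) const = 0;  // bounds-checked, returns by value
};

template <typename T>
class Primary_Series : public Series<T> {
    std::shared_ptr<const std::vector<T>> m_data;  // shared with derived series
public:
    explicit Primary_Series(std::vector<T> v)
        : m_data(std::make_shared<std::vector<T>>(std::move(v))) {}
    std::size_t size() const override { return m_data->size(); }
    T at(std::size_t i) const override { return m_data->at(i); }
    // Hypothetical accessor used to hand the storage to a derived series.
    std::shared_ptr<const std::vector<T>> data() const { return m_data; }
};

// Assumed transform: every element multiplied by a fixed factor.
template <typename T>
class Derived_Series : public Series<T> {
    std::shared_ptr<const std::vector<T>> m_source;  // keeps the data alive
    T m_factor;
public:
    Derived_Series(const Primary_Series<T>& p, T factor)
        : m_source(p.data()), m_factor(factor) {}
    std::size_t size() const override { return m_source->size(); }
    // Computed on demand; nothing is stored per element.
    T at(std::size_t i) const override { return m_source->at(i) * m_factor; }
};

// Works on any Series<T> without knowing whether it is lazy.
template <typename T>
T series_sum(const Series<T>& s) {
    T total = T();
    for (std::size_t i = 0; i < s.size(); ++i) total += s.at(i);
    return total;
}

// Usage sketch: the derived view outlives the primary it was built from.
inline void example_usage() {
    std::unique_ptr<Derived_Series<int>> d;
    {
        Primary_Series<int> p(std::vector<int>{1, 2, 3});
        d.reset(new Derived_Series<int>(p, 3));
    }  // p is destroyed here; the shared vector stays alive
    assert(d->at(1) == 6);
    assert(series_sum(*d) == 18);
}
```

The derived series costs one shared_ptr plus the factor, regardless of how large the primary data is. The price is a virtual call per element, which may or may not matter for your workload.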

---edit---
After mulling it over a bit more, I also have to note that the implementation of Derived_Series will be kind of painful either way. Its methods will have to return by value, and its iterators will basically be input iterators masquerading as a higher class of iterator, because returning by reference doesn't really work for lazily evaluated values. The alternative is for it to fill in its own data structure as the positions of the original series are evaluated, which brings a completely different set of tradeoffs.

Xarn answered Sep 21 '22 08:09


One solution is to use a std::shared_ptr<std::vector<T>> to store the vector in Primary_Series, and to make ref() return a shared_ptr instead of a reference. Primary_Series just returns its member, while Derived_Series creates a new vector and returns that via a shared_ptr. When the caller no longer needs the value returned by Derived_Series, it is destroyed automatically.
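A minimal sketch of this shared_ptr-returning variant follows. The constant-factor transform and the constructor signatures are assumptions made for illustration; only the idea of returning a shared_ptr comes from the answer.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
struct Series {
    virtual ~Series() {}
    // The caller co-owns whatever is returned; it is freed when the last
    // shared_ptr referring to it goes away.
    virtual std::shared_ptr<const std::vector<T>> ref() const = 0;
};

template <typename T>
class Primary_Series : public Series<T> {
    std::shared_ptr<const std::vector<T>> m_data;
public:
    explicit Primary_Series(std::vector<T> v)
        : m_data(std::make_shared<std::vector<T>>(std::move(v))) {}
    // No element is copied: only the reference count changes.
    std::shared_ptr<const std::vector<T>> ref() const override { return m_data; }
};

template <typename T>
class Derived_Series : public Series<T> {
    std::shared_ptr<const std::vector<T>> m_source;
    T m_factor;  // assumed transform: multiply by a constant
public:
    Derived_Series(const Primary_Series<T>& p, T factor)
        : m_source(p.ref()), m_factor(factor) {}
    // Builds a fresh vector on every call; it is destroyed as soon as the
    // caller drops the returned pointer.
    std::shared_ptr<const std::vector<T>> ref() const override {
        auto out = std::make_shared<std::vector<T>>();
        out->reserve(m_source->size());
        for (const T& x : *m_source) out->push_back(x * m_factor);
        return out;
    }
};

// Usage sketch: the temporary lives exactly as long as `tmp` does.
inline void example_usage() {
    Primary_Series<int> p(std::vector<int>{1, 2, 3});
    Derived_Series<int> d(p, 3);
    {
        auto tmp = d.ref();            // full transformed vector exists here
        assert(tmp->at(1) == 6);
        assert(tmp.use_count() == 1);  // the caller is the sole owner
    }                                  // and it is freed here
}
```

Note the tradeoff: this avoids copying for Primary_Series, but each call to Derived_Series::ref() still allocates a full-size vector for as long as the caller holds it. A std::weak_ptr would not help here, because a weak_ptr does not keep the temporary alive.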

Alternatively, you can design your class to mimic the interface of a std::vector<T>, but design Derived_Series so that it returns the transformed values instead of the stored values. That way, no vector ever needs to be returned. If you don't want to write all of the member functions a std::vector<T> has, you could just write some sort of transforming iterator that iterates over a std::vector<T> and transforms each element as it goes. Then you don't even need a complicated class hierarchy.
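A transforming iterator along these lines could be sketched as below. Transform_Iterator, Multiply_By, transform_begin and transform_end are hypothetical names for this illustration, not standard-library facilities (Boost.Iterator provides a more complete boost::transform_iterator if a dependency is acceptable).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Input iterator that applies a function object to each element of a
// std::vector<T> at the moment it is dereferenced.
template <typename T, typename F>
class Transform_Iterator {
    typename std::vector<T>::const_iterator m_it;
    F m_f;
public:
    typedef std::input_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const T* pointer;
    typedef T reference;  // by value: there is no stored element to refer to

    Transform_Iterator(typename std::vector<T>::const_iterator it, F f)
        : m_it(it), m_f(f) {}

    T operator*() const { return m_f(*m_it); }
    Transform_Iterator& operator++() { ++m_it; return *this; }
    Transform_Iterator operator++(int) {
        Transform_Iterator old(*this);
        ++m_it;
        return old;
    }
    bool operator==(const Transform_Iterator& o) const { return m_it == o.m_it; }
    bool operator!=(const Transform_Iterator& o) const { return m_it != o.m_it; }
};

template <typename T, typename F>
Transform_Iterator<T, F> transform_begin(const std::vector<T>& v, F f) {
    return Transform_Iterator<T, F>(v.begin(), f);
}

template <typename T, typename F>
Transform_Iterator<T, F> transform_end(const std::vector<T>& v, F f) {
    return Transform_Iterator<T, F>(v.end(), f);
}

// Assumed transform: multiply by a constant factor.
template <typename T>
struct Multiply_By {
    T factor;
    T operator()(const T& x) const { return x * factor; }
};

// Usage sketch: sum the transformed series without building a second vector.
inline void example_usage() {
    std::vector<int> primary{1, 2, 3};
    Multiply_By<int> times3{3};
    int total = std::accumulate(transform_begin(primary, times3),
                                transform_end(primary, times3), 0);
    assert(total == 18);
}
```

The iterators borrow the vector rather than owning it, so the source vector must outlive them. A function object is used instead of a lambda because C++11 lambdas are not copy-assignable, which iterators are expected to be.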

MasterGeek answered Sep 22 '22 08:09