Should I use pointers or move semantics for passing big chunks of data?

I have a question about recommended coding technique. I have a tool for model analysis, and I sometimes need to pass a large amount of data (from a factory class to one that holds multiple heterogeneous chunks).

My question is whether there is a consensus on using pointers versus moving ownership. I need to avoid copying wherever possible, as a single data block may be as large as 1 GB.

The pointer version would look like this:

class FactoryClass {
...
public:
   static Data * createData() {
      Data * data = new Data;
      ...
      return data;
   }
};

class StorageClass {
   unique_ptr<Data> data_ptr;
...
public:
   void setData(Data * _data_ptr) {
      data_ptr.reset(_data_ptr);
   }
};

void pass() {
   Data * data = FactoryClass::createData();
   ...
   StorageClass storage;
   storage.setData(data);
}

Whereas the move version is like this:

class FactoryClass {
...
public:
   static Data createData() {
      Data data;
      ...
      return data;
   }
};

class StorageClass {
   Data data;
...
public:
   void setData(Data _data) {
      data = move(_data);
   }
};

void pass() {
   Data data = FactoryClass::createData();
   ...
   StorageClass storage;
   storage.setData(move(data));
}

I like the move version better. Yes, I need to add move calls to the main code, but in the end I have just the objects in the storage and no longer have to care about pointer semantics.

However, I am not entirely comfortable using move semantics, which I do not understand in detail. (I do not mind the C++11 requirement, though, as the code already compiles only with GCC 4.7+.)

Would someone have a reference that would support either version? Or is there some other, preferred version of how to pass data?

I was not able to Google anything as the keywords usually led to other topics.

Thanks.

EDIT NOTE: The second example got refactored to incorporate suggestions from the comments, the semantics remained unchanged.


asked Jul 29 '13 16:07

Adam Streck

1 Answer

When you are passing an object to a function, what you pass depends in part on how that function is going to use it. A function can use an object in one of three general ways:

1. It can simply reference the object for the duration of the function call, with the calling function (or its eventual parent up the call stack) maintaining ownership of the object. The reference in this case may be a constant reference or a modifiable reference. The function will not store this object long-term.
2. It can copy the object directly. It doesn't gain ownership of the original, but it does acquire a copy of the original, so as to store, modify, or do with the copy what it will. Note that the difference between #1 and this is that the copy is made explicit in the parameter list. For example, taking a std::string by value. But this could also be as simple as taking an int by value.
3. It can gain some form of ownership of the object. The function then has some responsibility over the object's destruction. This also allows the function to store the object long-term.

My general recommendations for the parameter types for these paradigms are as follows:

1. Take the object by an explicit language reference where possible. If that's not possible, try a std::reference_wrapper. If that can't work, and no other solutions seem reasonable, then use a pointer. A pointer would be for things like optional parameters (though C++17's std::optional makes that less necessary; pointers still have uses), language arrays (though again, we have objects that cover most of the uses of these), and so forth.
2. Take the object by value. That one's pretty non-negotiable.
3. Take the object either by value-move (i.e., move it into a by-value parameter) or by a smart pointer to the object (which will also be taken by value, since you're going to copy/move it anyway). The problem with your code is that you transfer ownership through a raw pointer. Raw pointers have no ownership semantics. The moment you allocate any pointer, you should immediately wrap it in some kind of smart pointer. So your factory function should have returned a unique_ptr.
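To make the three cases concrete, here is a minimal sketch of one parameter style per case. The `Data` struct and the functions `countValues`, `shout`, and `demoParameterStyles` are hypothetical names invented for this illustration; only `StorageClass::setData` mirrors the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Data type.
struct Data {
    std::vector<double> values;
};

// Case 1: only look at the object during the call; the caller keeps ownership.
std::size_t countValues(const Data& data) {
    return data.values.size();
}

// Case 2: the function wants its own copy, so it takes the parameter by value.
std::string shout(std::string text) {
    text += "!";
    return text;
}

// Case 3: the function takes ownership and stores the object long-term.
class StorageClass {
    Data data;
public:
    void setData(Data newData) {       // by-value "sink" parameter
        data = std::move(newData);     // move into the member, no deep copy
    }
    std::size_t size() const { return data.values.size(); }
};

// Exercises all three cases; returns the number of values that end up stored.
std::size_t demoParameterStyles() {
    Data data;
    data.values.assign(1000, 1.0);
    assert(countValues(data) == 1000);   // case 1: data is untouched

    std::string word = "move";
    assert(shout(word) == "move!");      // case 2: the function worked on a copy
    assert(word == "move");              // the caller's string is unchanged

    StorageClass storage;
    storage.setData(std::move(data));    // case 3: ownership is handed over
    return storage.size();
}
```

The caller writes `std::move` only in case 3, which makes the hand-over of ownership visible at the call site.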

Your case appears to be #3. Which you use between value-move and smart pointer is entirely up to you. If you have to heap allocate Data for some reason, then the choice is pretty much made for you. If Data can be stack allocated, then you have some options.
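If you go the smart-pointer route, the question's pointer version could be reshaped roughly as follows. This is a sketch, not a drop-in replacement: the contents of `Data`, the fill step in `createData`, and the helper `passByUniquePtr` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Data type.
struct Data {
    std::vector<double> values;
};

class FactoryClass {
public:
    // The return type now says that the caller owns the result.
    static std::unique_ptr<Data> createData() {
        std::unique_ptr<Data> data(new Data);  // C++11; std::make_unique needs C++14
        data->values.assign(1000, 0.0);        // placeholder for the real fill step
        return data;
    }
};

class StorageClass {
    std::unique_ptr<Data> data_ptr;
public:
    // Taking the unique_ptr by value forces the caller to give up ownership.
    void setData(std::unique_ptr<Data> newData) {
        data_ptr = std::move(newData);
    }
    std::size_t size() const {
        return data_ptr ? data_ptr->values.size() : 0;
    }
};

// Mirrors the question's pass() function; returns the stored element count.
std::size_t passByUniquePtr() {
    std::unique_ptr<Data> data = FactoryClass::createData();
    StorageClass storage;
    storage.setData(std::move(data));
    assert(data == nullptr);   // a moved-from unique_ptr is guaranteed to be null
    return storage.size();
}
```

At no point does a raw owning pointer exist, so an exception thrown between creation and storage cannot leak the block.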

I would generally do this based on an estimation of Data's internal size. If internally, it's just a few pointers/integers (and by "few", I mean like 3-4), then putting it on the stack is fine.

Indeed, it can be better, because you'll have less chance of a double cache miss. Suppose Data's member functions mostly just access data through an internal pointer. If you store Data by pointer, then every function call on it has to dereference your stored pointer to fetch the internal one, then dereference the internal one. That's two potential cache misses, since neither pointed-to location has any locality with StorageClass.

If you store Data by value, it's much more likely that Data's internal pointer will already be in the cache. It has better locality with StorageClass's other members; if you accessed some of StorageClass before now, you already paid for a cache miss, so you are likely to already have Data in the cache.
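The layout difference behind this argument can be checked directly. In the sketch below, `StorageByValue`, `StorageByPointer`, `liesInside`, and `demoLocality` are hypothetical names; the code only shows where the `Data` object lives and does not measure cache behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's Data type.
struct Data {
    std::vector<double> values;   // holds an internal pointer to its buffer
};

struct StorageByValue {
    int id = 0;
    Data data;                    // stored inside the StorageByValue object
};

struct StorageByPointer {
    int id = 0;
    std::unique_ptr<Data> data;   // the Data object is a separate heap allocation
};

// True if address p falls within the bytes occupied by obj.
template <typename T>
bool liesInside(const void* p, const T& obj) {
    const auto addr  = reinterpret_cast<std::uintptr_t>(p);
    const auto begin = reinterpret_cast<std::uintptr_t>(&obj);
    return addr >= begin && addr < begin + sizeof(T);
}

bool byValueIsInline() {
    StorageByValue storage;
    return liesInside(&storage.data, storage);
}

bool byPointerIsInline() {
    StorageByPointer storage;
    storage.data.reset(new Data);
    return liesInside(storage.data.get(), storage);
}

void demoLocality() {
    assert(byValueIsInline());      // one hop: storage -> element buffer
    assert(!byPointerIsInline());   // two hops: storage -> Data -> element buffer
}
```

With the by-value member, touching `id` is likely to bring `Data`'s internal pointer into the cache as well; with the pointer member, an extra memory location has to be fetched first.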

But movement is not free. It's cheaper than a full copy, but it's not free. You're still copying the object's direct members, such as its internal pointers and integers (and possibly nulling out any pointers on the original). But then again, allocating memory on the heap isn't free either. Nor is deallocating it.
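A small sketch of what a move actually does, assuming `Data` keeps its payload in a `std::vector` (the question does not show `Data`'s internals, so the struct, its `revision` member, and `moveReusesBuffer` are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical Data: a few direct members plus a large heap buffer.
struct Data {
    std::vector<double> values;   // direct part: a handful of pointers
    int revision = 0;             // direct part: a plain integer
};

// Moves a Data object and reports whether the heap buffer was reused.
bool moveReusesBuffer(std::size_t count) {
    Data source;
    source.values.assign(count, 1.0);
    source.revision = 7;
    const double* bufferBefore = source.values.data();

    // Implicit move constructor: each member is moved in turn.
    Data target = std::move(source);

    assert(target.values.size() == count);
    assert(target.revision == 7);   // plain members are simply copied

    // On mainstream standard libraries the vector hands over its buffer,
    // so the elements themselves are never touched.
    return target.values.data() == bufferBefore;
}
```

The work done is proportional to `sizeof(Data)`, not to the number of elements, which is why moving a 1 GB block held this way costs about the same as moving an empty one.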

But then again, if you're not moving it around very often (you move it around to get it to its final location, but little more after that), even moving a larger object would be fine. If you're using it more than you're moving it, then the cache locality of the object's storage will probably win out over the cost of moving.

There ultimately aren't a lot of technical reasons to pick one or the other. I would say to default to movement where reasonable.


answered Sep 27 '22 17:09

Nicol Bolas

Tags: c++, pointers, c++11, move
