There is a class SomeClass
which holds some data and methods that operates on this data. And it must be created with some arguments like:
SomeClass(int some_val, float another_val);
There is another class, say Manager
, which includes SomeClass
, and heavily uses its methods.
So, what would be better in terms of performance (data locality, cache hits, etc.), declare object of SomeClass
as member of Manager
and use member initialization in Manager
's constructor or declare object of SomeClass as unique_ptr?
class Manager
{
public:
Manager() : some(5, 3.0f) {}
private:
SomeClass some;
};
or
class Manager
{
public:
Manager();
private:
std::unique_ptr<SomeClass> some;
}
Most likely, there is no difference in runtime efficiency of accessing your subobject. But using pointer can be slower for several reasons (see details below).
Moreover, there are several other things you should remember:
Speaking of compile times, pointer is better than plain member. With plain member, you cannot remove dependency of Manager
declaration on SomeClass
declaration. With pointers, you can do it with forward declaration. Less dependencies may result is less build times.
I'd like to provide more details about performance of subobject accesses. I think that using pointer can be slower than using plain member for several reasons:
Manager
and SomeClass
together, and plain member is guaranteed to be near other data, while heap allocations may place object and subobject far from each other.Here is an example for the last issue (full code is here):
struct IntValue {
int x;
IntValue(int x) : x(x) {}
};
class MyClass_Ptr {
unique_ptr<IntValue> a, b, c;
public:
void Compute() {
a->x += b->x + c->x;
b->x += a->x + c->x;
c->x += a->x + b->x;
}
};
Clearly, it is stupid to store subobjects a
, b
, c
by pointers. I've measured time spent in one billion calls of Compute
method for a single object. Here are results with different configurations:
2.3 sec: plain member (MinGW 5.1.0)
2.0 sec: plain member (MSVC 2013)
4.3 sec: unique_ptr (MinGW 5.1.0)
9.3 sec: unique_ptr (MSVC 2013)
When looking at the generated assembly for innermost loop in each case, it is easy to understand why the times are so different:
;;; plain member (GCC)
lea edx, [rcx+rax] ; well-optimized code: only additions on registers
add r8d, edx ; all 6 additions present (no CSE optimization)
lea edx, [r8+rax] ; ('lea' instruction is also addition BTW)
add ecx, edx
lea edx, [r8+rcx]
add eax, edx
sub r9d, 1
jne .L3
;;; plain member (MSVC)
add ecx, r8d ; well-optimized code: only additions on registers
add edx, ecx ; 5 additions instead of 6 due to a common subexpression eliminated
add ecx, edx
add r8d, edx
add r8d, ecx
dec r9
jne SHORT $LL6@main
;;; unique_ptr (GCC)
add eax, DWORD PTR [rcx] ; slow code: a lot of memory accesses
add eax, DWORD PTR [rdx] ; each addition loads value from memory
mov DWORD PTR [rdx], eax ; each sum is stored to memory
add eax, DWORD PTR [r8] ; compiler is afraid that some values may be at same address
add eax, DWORD PTR [rcx]
mov DWORD PTR [rcx], eax
add eax, DWORD PTR [rdx]
add eax, DWORD PTR [r8]
sub r9d, 1
mov DWORD PTR [r8], eax
jne .L4
;;; unique_ptr (MSVC)
mov r9, QWORD PTR [rbx] ; awful code: 15 loads, 3 stores
mov rcx, QWORD PTR [rbx+8] ; compiler thinks that values may share
mov rdx, QWORD PTR [rbx+16] ; same address with pointers to values!
mov r8d, DWORD PTR [rcx]
add r8d, DWORD PTR [rdx]
add DWORD PTR [r9], r8d
mov r8, QWORD PTR [rbx+8]
mov rcx, QWORD PTR [rbx] ; load value of 'a' pointer from memory
mov rax, QWORD PTR [rbx+16]
mov edx, DWORD PTR [rcx] ; load value of 'a->x' from memory
add edx, DWORD PTR [rax] ; add the 'c->x' value
add DWORD PTR [r8], edx ; add sum 'a->x + c->x' to 'b->x'
mov r9, QWORD PTR [rbx+16]
mov rax, QWORD PTR [rbx] ; load value of 'a' pointer again =)
mov rdx, QWORD PTR [rbx+8]
mov r8d, DWORD PTR [rax]
add r8d, DWORD PTR [rdx]
add DWORD PTR [r9], r8d
dec rsi
jne SHORT $LL3@main
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With