Fastest way to negate a std::vector

Tags: c++, optimization, stdvector, lapack, blas

Assume I have a std::vector of double, namely

std::vector<double> MyVec(N);

where N is large enough that performance matters. Assume that MyVec is a nontrivial vector (i.e. it is not a vector of zeros, but has been modified by some routine). I now need the negated version of the vector, i.e. -MyVec.

So far, I have been implementing it via

std::transform(MyVec.cbegin(),MyVec.cend(),MyVec.begin(),std::negate<double>());

But I really do not know whether this is sensible or just naive on my part.

Am I doing it correctly? Or is std::transform just a slow routine in this case?

PS: I use the BLAS and LAPACK libraries all the time, but I have not found anything in them that matches this particular need. If BLAS/LAPACK does have a function for this that is faster than std::transform, I would be glad to know.

asked Nov 15 '17 14:11

enanone

2 Answers

#include <vector>
#include <algorithm>
#include <functional>

void check() {
    std::vector<double> MyVec(255);
    std::transform(MyVec.cbegin(), MyVec.cend(), MyVec.begin(), std::negate<double>());
}

Compiled on https://godbolt.org/ with the -O3 option, this code generates nice assembly:

.L3:
  [...]
  cmp r8, 254
  je .L4
  movsd xmm0, QWORD PTR [rdi+2032]
  xorpd xmm0, XMMWORD PTR .LC0[rip]
  movsd QWORD PTR [rdi+2032], xmm0
.L4:

It's difficult to imagine anything faster: each negation is a single xorpd against a constant (presumably the sign-bit mask). Your code is already perfect. Don't try to outsmart the compiler; write clean C++ code, which works almost every time.
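To make the point about the xorpd instruction concrete, here is a minimal sketch that negates with std::transform and then checks that each element differs from its original only in the sign bit. It assumes IEEE 754 64-bit doubles; the names negate_transform, bits_of, only_sign_bit_flipped and check_sign_flip are hypothetical helpers introduced for illustration and are not part of the original answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Negate every element in place, exactly as in the question.
void negate_transform(std::vector<double>& v) {
    std::transform(v.cbegin(), v.cend(), v.begin(), std::negate<double>());
}

// Hypothetical helper: return the raw 64-bit pattern of a double.
// Assumes double is an IEEE 754 binary64 value.
std::uint64_t bits_of(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
    std::uint64_t b;
    std::memcpy(&b, &x, sizeof b);
    return b;
}

// True if `after` differs from `before` only in bit 63 (the sign bit),
// which is what one XOR with a sign mask produces.
bool only_sign_bit_flipped(double before, double after) {
    return (bits_of(before) ^ bits_of(after)) == (std::uint64_t{1} << 63);
}

// Hypothetical self-check: negate a small vector and verify each element.
void check_sign_flip() {
    std::vector<double> v{1.5, -2.0, 0.0};
    const std::vector<double> before = v;
    negate_transform(v);
    for (std::size_t i = 0; i < v.size(); ++i) {
        assert(only_sign_bit_flipped(before[i], v[i]));
    }
}
```

Calling check_sign_flip() should pass silently on a typical x86-64 build without fast-math options. Since the whole operation is one bit flip per element, there is little left for a hand-written version to improve on.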

answered Sep 22 '22 01:09

ColdCat

Fortunately, the data in a std::vector is contiguous, so you can multiply by -1 using vector intrinsics (with unaligned loads/stores and special handling of the possible overflow, i.e. the elements left over when the length is not a multiple of the register width). Alternatively, use ippsMulC_64f/ippsMulC_64f_I from Intel's IPP library (you'll struggle to write something faster), which will use the largest vector registers available on your platform: https://software.intel.com/en-us/ipp-dev-reference-mulc

Update: to clear up some confusion in the comments, the full version of Intel IPP is free (although you can pay for support) and is available for Linux, Windows and macOS.
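As a rough illustration of the intrinsics route mentioned above, the following sketch uses SSE2 (two doubles per register) with unaligned loads/stores and a scalar loop for the leftover element. It flips the sign bit with an XOR instead of multiplying by -1, which is an assumption about what a hand-written version might do, not the answer's own code. The function name negate_simd and the macro NEGATE_HAVE_SSE2 are hypothetical; on targets without SSE2 the function falls back to a plain loop.

```cpp
#include <cstddef>
#include <vector>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define NEGATE_HAVE_SSE2 1
#else
#define NEGATE_HAVE_SSE2 0
#endif

// Hypothetical helper: negate v in place by flipping each sign bit.
void negate_simd(std::vector<double>& v) {
    double* p = v.data();
    const std::size_t n = v.size();
    std::size_t i = 0;
#if NEGATE_HAVE_SSE2
    // -0.0 has only the sign bit set, so it serves as the XOR mask.
    const __m128d sign_mask = _mm_set1_pd(-0.0);
    for (; i + 2 <= n; i += 2) {
        const __m128d x = _mm_loadu_pd(p + i);            // unaligned load
        _mm_storeu_pd(p + i, _mm_xor_pd(x, sign_mask));   // flip signs, unaligned store
    }
#endif
    // Scalar tail: the last element when n is odd, or every element
    // when SSE2 is not available.
    for (; i < n; ++i) {
        p[i] = -p[i];
    }
}
```

A call such as negate_simd(MyVec) would replace the std::transform line. Given the assembly shown in the other answer, an optimizing compiler may well generate equivalent code from the plain std::transform version, so it is worth measuring before adopting something like this.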

answered Sep 22 '22 01:09

keith
