Ensuring that Eigen uses AVX vectorization for a certain operation

I've written vectorized versions of some functions that are currently the bottleneck of an algorithm, using Eigen's facilities to do so.

I've also checked that AVX is enabled by making sure that EIGEN_VECTORIZE_AVX is defined after including Eigen.

However, my function never seems to get called with Packet8f (AVX) when the data size is fixed and not a multiple of 8. Instead, it gets called with Packet4f (SSE).

Here is a small repro: https://gist.github.com/bitonic/e89561cb21837b4dee8b5f49e1303919 . It defines an operation for both Packet4f and Packet8f, and then counts how many times each version gets called with arrays of size 8 and 9. When the array is of size 8, the Packet8f version gets called once, as expected. When it's of size 9, the Packet4f version gets called twice instead, plus a single call to the non-vectorized version. I've tested this code on Eigen's current master (commit 1d0c45122a5c4c5c1c4309f904120e551bacad02).

I've dug a bit and I believe that packet selection is happening here: https://gitlab.com/libeigen/eigen/blob/1d0c45122a5c4c5c1c4309f904120e551bacad02/Eigen/src/Core/util/XprHelper.h#L197 .

If I understand correctly, if the size of the data is not dynamic and not a multiple of 8 (that's the value of unpacket_traits<Packet8f>::size), the half-packet will be selected, which matches what the reproduction above shows.

If my understanding is correct, why is that the case? Shouldn't the full packet be selected, with the remaining elements handled by the non-vectorized operation?

Could it be that the condition is wrong and should be a >= comparison instead, e.g. something like

template<int Size, typename PacketType,
         bool Stop = Size==Dynamic || Size >= unpacket_traits<PacketType>::size || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value>
struct find_best_packet_helper;

instead of

template<int Size, typename PacketType,
         bool Stop = Size==Dynamic || (Size%unpacket_traits<PacketType>::size)==0 || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value>
struct find_best_packet_helper;

I've verified that with the fix above the problem goes away.

However, I might be misunderstanding what is going on here, since I'm not very well versed in Eigen internals.

asked Jan 12 '20 23:01

bitonic

1 Answer

I have confirmed that this is due to how Eigen selects the packet type; see https://gitlab.com/libeigen/eigen/merge_requests/46 for a fix.

answered Sep 21 '22 06:09

bitonic

Tags: c++, vectorization, avx, simd, eigen
