Given a register of 4 bytes (or 16 for SIMD), there has to be an efficient way to sort the bytes in-register with a few instructions.
Thanks in advance.
Because it has the upper hand in the average case for most inputs, Quicksort is generally considered the “fastest” sorting algorithm.
Quicksort. Quicksort is one of the most efficient sorting algorithms, which also makes it one of the most widely used. The first step is to select a pivot value, which partitions the data: values smaller than the pivot go to its left and larger values go to its right. The same procedure is then applied recursively to each side.
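The pivot-and-partition idea can be sketched in a few lines of Python. This is only an illustration: the function name `quicksort` and the choice of the middle element as pivot are assumptions made here, and it builds new lists rather than partitioning in place as production implementations usually do.

```python
def quicksort(values):
    """Return a new sorted list; the input is left unchanged."""
    if len(values) <= 1:
        return list(values)
    # Pivot choice is an assumption for this sketch; any element would work.
    pivot = values[len(values) // 2]
    smaller = [v for v in values if v < pivot]   # goes to the left of the pivot
    equal = [v for v in values if v == pivot]    # the pivot and its duplicates
    larger = [v for v in values if v > pivot]    # goes to the right of the pivot
    # Recurse on each side, then join the three parts.
    return quicksort(smaller) + equal + quicksort(larger)
```

For example, `quicksort([3, 1, 4, 1, 5, 9, 2, 6])` returns `[1, 1, 2, 3, 4, 5, 6, 9]`.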
If all your inputs are integers and you have more than about 1024 of them, radix sort is usually faster.
The radix sort algorithm works by ordering the values one digit at a time, from least significant to most significant. In base 10, radix sort would sort by the digit in the ones place, then the tens place, and so on. To sort the values by each digit place, radix sort employs counting sort as a subroutine.
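A minimal sketch of that least-significant-digit scheme, assuming non-negative integers only. The names `counting_sort_by_digit` and `radix_sort` are made up for this example; the counting sort pass is stable, which is what lets the later passes preserve the order set by the earlier ones.

```python
def counting_sort_by_digit(values, place, base=10):
    """Stable counting sort keyed on the digit at `place` (1, 10, 100, ...)."""
    counts = [0] * base
    for v in values:
        counts[(v // place) % base] += 1
    # Turn the counts into the starting index of each digit's bucket.
    total = 0
    for d in range(base):
        counts[d], total = total, total + counts[d]
    out = [0] * len(values)
    for v in values:  # left-to-right scan keeps equal digits in input order
        d = (v // place) % base
        out[counts[d]] = v
        counts[d] += 1
    return out


def radix_sort(values, base=10):
    """LSD radix sort; assumes every value is a non-negative integer."""
    values = list(values)
    if not values:
        return values
    largest = max(values)
    place = 1
    while largest // place > 0:  # one pass per digit of the largest value
        values = counting_sort_by_digit(values, place, base)
        place *= base
    return values
```

For example, `radix_sort([170, 45, 75, 90, 802, 24, 2, 66])` returns `[2, 24, 45, 66, 75, 90, 170, 802]`.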
Found it! It's in the 2007 paper "Using SIMD Registers and Instructions to Enable Instruction-Level Parallelism in Sorting Algorithms" by Furtak, Amaral, and Niewiadomski. Section 4.
It uses 4 SSE registers, has 12 steps, and runs in 19 instructions including load and store.
The same paper has some excellent work on dynamically making sorting networks with SIMD.
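To give a feel for what a sorting network looks like, here is a small Python sketch for the 4-byte case from the question. It is not the paper's 19-instruction SSE sequence; it only models the well-known 5-comparator network for 4 inputs on the bytes of a 32-bit word. The names `NETWORK_4` and `sort_packed_bytes` are made up for this example. On real hardware each compare-exchange would map to a byte-wise min/max pair plus shuffles, which is an assumption about how one might port it, not something taken from the paper.

```python
# Fixed sequence of compare-exchange pairs that sorts any 4 inputs.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def sort_packed_bytes(word):
    """Sort the 4 bytes of a 32-bit word; smallest ends up in the low byte."""
    # Unpack: lane 0 is the least significant byte.
    lanes = [(word >> (8 * i)) & 0xFF for i in range(4)]
    for lo, hi in NETWORK_4:
        a, b = lanes[lo], lanes[hi]
        # Compare-exchange: branch-free in spirit, just a min and a max.
        lanes[lo], lanes[hi] = min(a, b), max(a, b)
    # Repack the lanes into a single integer.
    return sum(b << (8 * i) for i, b in enumerate(lanes))
```

For example, `sort_packed_bytes(0x03010402)` returns `0x04030201`. The comparator sequence does not depend on the data, which is why networks like this suit SIMD registers.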