What are some general tips/pointers on vectorizing tree operations? Memory layout wise, algorithm wise, etc.
Some domain specific stuff:
Beware, this is very hard to implement. Last year a team of Intel, Oracle and UCSC presented an amazing solution "FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs". They won the "Best Paper Award 2010" by ACM SIGMOD.
Because of the random nature of trees it's not immediately obvious how vectorizing walks would be a big plus to you.
I would lay the tree out as a flat array of (parentid, node data) "node" items, sorted by parentid, so you can at least visit the children of a node together. Of course this doesn't give you much if your tree isn't "fat" (ie low number of children on average for a node).
Your best bet though is really just to emphasize on the brute force of SIMD, because you really can't do fancy random jumps through your list with this API.
Edit: I wouldn't throw out the normal tree class you most likely have though, implement the SIMD way and see if you really gain anything, I'm not convinced you will...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With