Why not use the AVX registers as an ultra-fast cache?

Question

I've been wondering why the 16 256-bit registers provided by AVX2 aren't used to hold values from the normal registers when AVX itself can't help, to reduce cache accesses in situations where you simply don't have enough registers at hand. Isn't it true that you can write and read AVX registers in 1-2 cycles?

Of course, none of this would work if it interferes with other code that uses AVX and kicks its values out of the registers. I haven't seen this seemingly obvious approach used anywhere, which led me to ask this question.

gsg · Accepted Answer

At one time, Intel did recommend spilling from general-purpose registers to SSE registers in its optimization manual. (That's not AVX exactly, but it is the same idea.) I haven't looked at the very latest manuals, so that advice may or may not be out of date.

Spilling to xmm registers has the disadvantage that those registers are not preserved across function calls (under the System V x86-64 ABI, every xmm register is caller-saved). Given that x86-64 is a register-memory machine, accessing spilled values on the stack also requires fewer instructions and fewer registers (compare add rax, [rsp+k] with the pair movq rbx, xmm0 / add rax, rbx). That might go some way toward explaining why there isn't much interest in the technique.
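To make the comparison concrete, here is a minimal C sketch of the two spill strategies, assuming an x86-64 target with SSE2. The helper names (spill_to_xmm, reload_from_xmm, add_via_xmm_spill, add_via_stack_spill) are hypothetical and only for illustration; the intrinsics _mm_cvtsi64_si128 and _mm_cvtsi128_si64 are the ones that normally map to the movq forms mentioned above. A compiler is free to pick different instructions or to optimize the xmm round trip away entirely, so treat this as an illustration of the idea, not as a way to force a particular spill.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_cvtsi64_si128, _mm_cvtsi128_si64 (x86-64 only) */

/* Hypothetical helper: park a 64-bit integer in an xmm register
   (typically compiled to movq xmm, r64). */
static inline __m128i spill_to_xmm(int64_t value) {
    return _mm_cvtsi64_si128(value);
}

/* Hypothetical helper: move the parked value back into a
   general-purpose register (typically movq r64, xmm). */
static inline int64_t reload_from_xmm(__m128i slot) {
    return _mm_cvtsi128_si64(slot);
}

/* xmm spill: the value must be moved back to a GPR before the add,
   so the reload costs an extra instruction and a scratch register. */
int64_t add_via_xmm_spill(int64_t acc, int64_t spilled) {
    __m128i slot = spill_to_xmm(spilled);
    return acc + reload_from_xmm(slot);
}

/* Stack spill: the add can usually take the stack slot directly as a
   memory operand, e.g. add rax, [rsp+k]. The volatile qualifier is only
   here to keep the store and load from being optimized out. */
int64_t add_via_stack_spill(int64_t acc, int64_t spilled) {
    volatile int64_t slot = spilled;
    return acc + slot;
}
```

Both functions compute the same sum; the difference is only in where the second operand waits. Compiling with something like gcc -O2 -S and reading the generated assembly is a reasonable way to check what a given compiler actually emits for each variant.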

Tags:

performance

assembly

avx

cpu-registers

sse

user1610743
