
AArch64: what is late-forwarding?

"Late-forwarding" is mentioned in "Arm Neoverse E1 Core Software Optimization Guide" (as well as in their optimization guides for some other CPU models):

Instruction Group             | Instructions | Exec Latency | Exec Throughput | Notes
Multiply accumulate (32-bit)  | MADD, MSUB   | 3 (2)        | 1               | 2
Multiply accumulate (64-bit)  | MADD, MSUB   | 5 (4)        | 1/3             | 2

(2) Multiply-accumulate pipelines support late-forwarding of accumulate operands from similar μOPs, allowing a typical sequence of multiply-accumulate μOPs to issue one every N cycles (accumulate latency N shown in parentheses).

What does the term "late-forwarding" mean? What sequence of instructions would be subject to late-forwarding? A counter-example would also be helpful.

stepan asked Feb 15 '21 17:02



1 Answer

Late forwarding for multiply-add operations means that the addend only has to be available once the multiplication has completed, rather than when the multiply-add operation begins execution. Because the multiplication itself has no data dependency on the addend, it can start without it. Some of the addition's work can overlap with the multiplication (in a floating-point multiply-add, for example, the product's exponent is available early and can be compared with the addend's exponent to determine the alignment shift needed before the addition), so a design may want the addend before the entire product is ready, but even then the addend is needed much later than the multiplicands.
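To make this concrete for the MADD case from the question, here is a toy Python model. It is an illustrative sketch under stated assumptions, not a description of the real Neoverse E1 pipeline: the `Madd` class and `issue_cycles` function are hypothetical names, the model assumes in-order issue of at most one μOP per cycle, and the latencies 3 and 2 are the 32-bit figures from the table quoted in the question.

```python
from dataclasses import dataclass


@dataclass
class Madd:
    """One multiply-accumulate, rd = ra + rn * rm (AArch64 MADD operand order)."""
    rd: str
    rn: str
    rm: str
    ra: str


def issue_cycles(ops, full_latency, acc_latency):
    """Return the cycle at which each op can issue in a toy in-order model.

    A multiplicand (rn, rm) produced by an earlier op is usable full_latency
    cycles after that op issued. An accumulate operand (ra) produced by an
    earlier multiply-accumulate is usable after only acc_latency cycles,
    which is the late-forwarding case.
    """
    ready_full = {}  # register -> cycle it is ready as a multiplicand
    ready_acc = {}   # register -> cycle it is ready as an accumulate operand
    cycles = []
    next_slot = 0    # assumption: at most one issue per cycle
    for op in ops:
        start = max(
            next_slot,
            ready_full.get(op.rn, 0),
            ready_full.get(op.rm, 0),
            ready_acc.get(op.ra, 0),
        )
        cycles.append(start)
        ready_full[op.rd] = start + full_latency
        ready_acc[op.rd] = start + acc_latency
        next_slot = start + 1
    return cycles


# Candidate for late-forwarding: each result feeds the next ACCUMULATE operand.
#   madd w0, w1, w2, w0    ; w0 = w0 + w1 * w2
accumulate_chain = [Madd("w0", "w1", "w2", "w0") for _ in range(4)]

# Counter-example: each result feeds the next MULTIPLICAND.
#   madd w0, w0, w1, w2    ; w0 = w2 + w0 * w1
multiplicand_chain = [Madd("w0", "w0", "w1", "w2") for _ in range(4)]

acc_issue = issue_cycles(accumulate_chain, full_latency=3, acc_latency=2)
mul_issue = issue_cycles(multiplicand_chain, full_latency=3, acc_latency=2)
```

In this model the accumulate chain issues at cycles 0, 2, 4, 6 (one every 2 cycles), while the multiplicand chain issues at cycles 0, 3, 6, 9, because a multiplicand is needed when the multiplication starts and so gets no benefit from late forwarding.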

Because the addend can be forwarded (made available) this late, the effective latency of a chain of dependent accumulations is reduced. This in turn reduces the number of independent accumulation registers (and the amount of parallelism) needed to cover the latency.
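The effect on the number of accumulators can be estimated with a little arithmetic. The sketch below assumes a simple steady-state model in which the pipeline stays busy once the number of independent accumulator chains reaches the dependent latency multiplied by the throughput; `accumulators_needed` is a hypothetical helper, and the figures are the 32-bit MADD values from the table in the question.

```python
from fractions import Fraction
from math import ceil


def accumulators_needed(dependent_latency, throughput):
    """Independent accumulator chains needed to keep the pipeline busy.

    dependent_latency: cycles between two dependent accumulations
    throughput: sustained multiply-accumulates per cycle (int or Fraction)
    """
    return ceil(Fraction(dependent_latency) * Fraction(throughput))


# 32-bit MADD: full latency 3, accumulate latency 2, throughput 1 per cycle.
without_late_forwarding = accumulators_needed(3, 1)
with_late_forwarding = accumulators_needed(2, 1)
```

Under these assumptions, three independent accumulators would be needed to hide the full 3-cycle latency, but only two when the accumulate operand is forwarded late.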

Paul A. Clayton answered Sep 26 '22 23:09
