Additionally, because the intermediate results are always written out,
it is possible to service Precise Interrupts without affecting latency
(a common limitation of Vector ISAs implementing explicit
-Parallel Reduction instructions).
+Parallel Reduction instructions, because their Architectural State cannot
+hold the partial results).
## Basic principle