From 4ec6c482614a152bcaf8898f34a31d59ca25b46e Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 26 Mar 2023 22:28:00 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls009.mdwn | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/openpower/sv/rfc/ls009.mdwn b/openpower/sv/rfc/ls009.mdwn
index f81958d54..50aecd061 100644
--- a/openpower/sv/rfc/ls009.mdwn
+++ b/openpower/sv/rfc/ls009.mdwn
@@ -97,16 +97,18 @@ arbitrary access to elements (when elwidth overrides are used),
 independently on each Vector src or dest register. Aside from
 Indexed REMAP this is entirely Hardware-accelerated reordering and
 consequently not costly in terms of register access. It
-will however place a burden on Multi-Issue systems but no more than if -
-exactly as if - the equivalent Scalar instructions were explicitly
-loop-unrolled without SVP64.
+will however place a burden on Multi-Issue systems but no more than if
+the equivalent Scalar instructions were explicitly
+loop-unrolled without SVP64, and some advanced implementations may even find
+the Deterministic nature of the Scheduling to be easier on resources.
 
-The initial primary motivation of REMAP was for Matrix Multiplication, reordering of sequential
-data in-place: in-place DCT and FFT were easily justified given the
+The initial primary motivation of REMAP was for Matrix Multiplication, reordering
+of sequential data in-place: in-place DCT and FFT were easily justified given the
 high usage in Computer Science.
 Four SPRs are provided which may be applied to any GPR, FPR or CR Field
 so that for example a single FMAC may be
-used in a single loop to perform 5x3 times 3x4 Matrix multiplication,
+used in a single hardware-controlled 100% Deterministic loop to
+perform 5x3 times 3x4 Matrix multiplication,
 generating 60 FMACs *without needing explicit assembler unrolling*.
 Additional uses include regular "Structure Packing" such as RGB pixel
 data extraction and reforming.
-- 
2.30.2
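
The "60 FMACs" figure in the patched text follows from 5 x 3 x 4 = 60 multiply-accumulates. The Python sketch below illustrates the general idea of driving one repeated FMAC from a single flat loop whose counter is re-mapped into three element offsets. It is only a software analogy: the names `matmul_schedule` and `matmul_flat` are hypothetical, the matrices are assumed to be row-major flat lists, and the loop order is an arbitrary choice rather than the element order or SPR encoding that REMAP hardware actually uses.

```python
def matmul_schedule(rows, inner, cols):
    """Yield (c_idx, a_idx, b_idx) flat offsets, one tuple per multiply-accumulate.

    Hypothetical helper: the loop order here is an illustrative choice,
    not the element order the actual REMAP hardware produces.
    """
    for step in range(rows * inner * cols):
        i, rem = divmod(step, inner * cols)   # row of A and of the result
        j, k = divmod(rem, inner)             # result column, inner index
        yield i * cols + j, i * inner + k, k * cols + j


def matmul_flat(a, b, rows, inner, cols):
    """Multiply row-major flat lists a (rows x inner) and b (inner x cols)."""
    c = [0] * (rows * cols)
    fmacs = 0
    for c_idx, a_idx, b_idx in matmul_schedule(rows, inner, cols):
        c[c_idx] += a[a_idx] * b[b_idx]       # the single repeated FMAC
        fmacs += 1
    return c, fmacs


a = list(range(1, 16))   # 5x3 matrix, row-major
b = list(range(1, 13))   # 3x4 matrix, row-major
c, fmacs = matmul_flat(a, b, 5, 3, 4)
print(fmacs)             # prints 60, i.e. 5 * 3 * 4
```

The schedule is a pure function of the step counter, so every element access is known in advance; that is the property the patch describes as "100% Deterministic".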