From 4d7906b751126166f4c5291ff18c3ca862111c8c Mon Sep 17 00:00:00 2001
From: lkcl
Date: Mon, 27 Mar 2023 11:46:03 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls009.mdwn | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/openpower/sv/rfc/ls009.mdwn b/openpower/sv/rfc/ls009.mdwn
index d4a10b97e..44af23b15 100644
--- a/openpower/sv/rfc/ls009.mdwn
+++ b/openpower/sv/rfc/ls009.mdwn
@@ -362,7 +362,7 @@ the deterministic schedule
 programmers may find uses for the intermediate results.

 When Rc=1 a corresponding Vector of co-resultant CRs is also
-created. No special action is taken: the result and its CR Field
+created. No special action is taken: the result *and its CR Field*
 are stored "as usual" exactly as all other SVP64 Rc=1 operations.

 Note that the Schedule only makes sense on top of certain instructions:
@@ -371,7 +371,9 @@ and the destination are all the same type. Like Scalar
 Reduction, nothing is prohibited:
 the results of execution on an unsuitable instruction may simply
 not make sense. With care, even 3-input instructions (madd, fmadd, ternlogi)
-may be used.
+may be used, and whilst it is down to the Programmer to walk through the
+process the Programmer can be confident that the Parallel-Reduction is
+guaranteed 100% Deterministic.

 Critical to note regarding use of Parallel-Reduction REMAP is that,
 exactly as with all REMAP Modes, the `svshape` instruction *requests*
@@ -472,11 +474,12 @@ quantity at the same level of MSR and PC this is not a problem.
 The problems come when REMAP is enabled.

 Indexed REMAP must instead use `MAXVL` as the earliest (simplest)
-batch-level Hazard Reservation indicator,
+batch-level Hazard Reservation indicator (after taking element-width
+overriding on the Index source into consideration),
 but Matrix, FFT and Parallel Reduction must all use completely different
 schemes. The reason is that VL is used to step through the total
-number of *operations*, not the number of registers. The "Saving Grace"
-is that all of the REMAP Schedules are Deterministic.
+number of *operations*, not the number of registers.
+The "Saving Grace" is that all of the REMAP Schedules are 100% Deterministic.

 Advance-notice Parallel computation and subsequent cacheing
 of all of these complex Deterministic REMAP Schedules is
@@ -489,6 +492,12 @@ In short, there exists solutions to the problem of Hazard Management,
 with varying degrees of refinement possible at correspondingly
 increasing levels of complexity in hardware.

+A reminder: when Rc=1 each result register (element) has an associated
+co-result CR Field (one per result element). Thus above when determining
+the Write-Hazards for result registers the corresponding Write-Hazards for the
+corresponding associated co-result CR Field must not be forgotten, *including* when
+Predication is used.
+
 ## REMAP area of SVSTATE SPR

 The following bits of the SVSTATE SPR are used for REMAP:
--
2.30.2