From 711493322bc01067a8ae2e17b06524a354eb4a3b Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 25 Mar 2023 19:58:08 +0000
Subject: [PATCH]

---
 openpower/sv/rfc/ls008.mdwn | 106 ++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/openpower/sv/rfc/ls008.mdwn b/openpower/sv/rfc/ls008.mdwn
index a5cda55ea..8102fdb51 100644
--- a/openpower/sv/rfc/ls008.mdwn
+++ b/openpower/sv/rfc/ls008.mdwn
@@ -203,6 +203,112 @@ Notes:
 * Setting srcstep, dststep to 64 or greater, or VL or MVL to greater
   than 64 is reserved and will cause an illegal instruction trap.
 
+**SVSTATE Fields**
+
+SVSTATE is a standard SPR that (if REMAP is not activated) contains sufficient
+self-contained information for a full context save/restore.
+SVSTATE contains (and permits setting of):
+
+* MVL (the Maximum Vector Length) - declares (statically) how
+  much of a regfile is to be reserved for Vector elements
+* VL - Vector Length
+* dststep - the destination element offset of the current parallel
+  instruction being executed
+* srcstep - for twin-predication, the source element offset as well.
+* ssubstep - the source subvector element offset of the current
+  parallel instruction being executed
+* dsubstep - the destination subvector element offset of the current
+  parallel instruction being executed
+* vfirst - Vertical First mode. srcstep, dststep and substep
+  **do not advance** unless explicitly requested to do so with
+  pseudo-op svstep (a mode of setvl)
+* RMpst - REMAP persistence. REMAP will apply only to the following
+  instruction unless this bit is set, in which case REMAP "persists".
+  Reset (cleared) on use of the `setvl` instruction if used to
+  alter VL or MVL.
+* Pack - if set then srcstep/substep VL/SUBVL loop-ordering is inverted.
+* UnPack - if set then dststep/substep VL/SUBVL loop-ordering is inverted.
+* hphint - Horizontal Parallelism Hint. Indicates that
+  no Hazards exist between this number of sequentially-accessed
+  elements (including after REMAP). In Vertical First Mode
+  hardware **MUST** perform this many elements in parallel
+  per instruction. Set to zero to indicate "no hint".
+* SVme - REMAP enable bits, indicating which register is to be
+  REMAPed. RA, RB, RC, RT or EA.
+* mi0-mi4 - when the corresponding SVme bit is enabled, mi0-mi4
+  indicate the SVSHAPE (0-3) that the corresponding register (RA etc)
+  should use.
+
+Programmer's Note: when REMAP is activated it becomes necessary on any
+context-switch (Interrupt or Function call) to detect (or know in advance)
+that REMAP is enabled and to additionally save/restore the four SVSHAPE
+SPRs, SVSHAPE0-3. Given that this is expected to be a rare occurrence it was
+deemed unreasonable to burden every context-switch or function call with
+mandatory save/restore of SVSHAPEs, and consequently it is a *callee*
+(and Trap Handler) responsibility. Callees (and Trap Handlers) **MUST**
+avoid using all and any SVP64 instructions during the period where state
+could be adversely affected. SVP64 purely relies on Scalar instructions,
+so Scalar instructions (except the SVP64 Management ones and mtspr and
+mfspr) are 100% guaranteed to have zero impact on SVP64 state.
+
+**Max Vector Length (maxvl)**
+
+MAXVECTORLENGTH is the same concept as MVL in RISC-V RVV, except that it
+is variable length and may be dynamically set. MVL is limited to 7 bits
+(in the first version of SVP64) and consequently the maximum number of
+elements is limited to between 0 and 127.
+
+Programmer's Note: Except by directly using `mtspr` on SVSTATE, which may
+result in performance penalties on some hardware implementations, SVSTATE's `maxvl`
+field may only be set **statically** as an immediate, by the `setvl` instruction.
+It may **NOT** be set dynamically from a register. Compiler writers and assembly
+programmers are expected to perform static register file analysis, subdivision,
+and allocation and only utilise `setvl`. Direct writing to SVSTATE in order to
+"bypass" this Note could, in less-advanced implementations, potentially cause stalling.
+
+**Vector Length (vl)**
+
+`setvl` is conceptually similar to, but different from, the Cray, SX Aurora, and RISC-V RVV
+equivalent. Similar to RVV, VL is set to be within
+the range 0 <= VL <= MVL
+
+    VL = rd = MIN(vlen, MVL)
+
+where 0 <= MVL <= XLEN
+
+**SUBVL - Sub Vector Length**
+
+This is a "group by quantity" that effectively asks each iteration
+of the hardware loop to load SUBVL elements of width elwidth at a
+time. Effectively, SUBVL is like a SIMD multiplier: instead of just 1
+operation issued, SUBVL operations are issued.
+
+The main effect of SUBVL is that predication bits are applied per
+**group**, rather than by individual element. Legal values are 0 to 3,
+representing 1 operation through 4 operations respectively.
+
+**Horizontal Parallelism**
+
+A problem exists for hardware where it may not be able to detect
+that a programmer (or compiler) knows of opportunities for parallelism
+and lack of overlap between loops.
+
+For hphint, the number chosen must be consistently
+executed **every time**. Hardware is not permitted to execute five
+computations for one instruction then three on the next.
+hphint is a hint from the compiler to hardware that exactly this
+many elements may be safely executed in parallel, without hazards
+(including Memory accesses).
+Interestingly, when hphint is set equal to VL, it is in effect
+as if Vertical First mode were not set, because the hardware is
+given the option to run through all elements in an instruction.
+This is exactly what Horizontal-First is: a for-loop from 0 to VL-1
+except that the hardware may *choose* the number of elements.
+
+*Note to programmers: changing VL during the middle of such modes
+should be done only with due care and respect for the fact that SVSTATE
+has exactly the same peer-level status as a Program Counter.*
+
 -------------
 
 \newpage{}
-- 
2.30.2