# RFC ls008 SVP64 Management instructions
[[!tag opf_rfc]]
**URLs**:
*
*
*
*
**Severity**: Major
**Status**: New
**Date**: 24 Mar 2023
**Target**: v3.2B
**Source**: v3.0B
**Books and Section affected**:
```
Book I, new Scalar Chapter. (Or, new Book on "Zero-Overhead Loop Subsystem")
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
```
**Summary**
```
Instructions added
setvl - Cray-style "Set Vector Length" instruction
svstep - Vertical-First Mode explicit Step and Status
svremap - Re-Mapping of Register Element Offsets
svindex - General-purpose setting of SHAPEs to be re-mapped
svshape - Hardware-level setting of SHAPEs for element re-mapping
svshape2 - Hardware-level setting of SHAPEs for element re-mapping (v2)
```
**Submitter**: Luke Leighton (Libre-SOC)
**Requester**: Libre-SOC
**Impact on processor**:
```
Addition of six new "Zero-Overhead-Loop-Control" DSP-style Vector-style
Management Instructions which can be implemented extremely efficiently
and effectively by inserting an additional phase between Decode and Issue.
More complex designs are NOT adversely impacted and in fact greatly benefit
whilst still retaining an obvious linear sequential execution programming model.
```
**Impact on software**:
```
Requires support for new instructions in assembler, debuggers,
and related tools.
```
**Keywords**:
```
Cray Supercomputing, Vectorisation, Zero-Overhead-Loop-Control,
Scalable Vectors, Multi-Issue Out-of-Order, Sequential Programming Model
```
**Motivation**
TODO
**Notes and Observations**:
1. TODO
**Changes**
Add the following entries to:
* the Appendices of Book I
* Instructions of Book I as a new Section
* SVL-Form of Book I Section 1.6.1.6 and 1.6.2
----------------
\newpage{}
# svstep: Vertical-First Stepping and status reporting
SVL-Form
* svstep RT,SVi,vf (Rc=0)
* svstep. RT,SVi,vf (Rc=1)
| 0-5|6-10|11-15|16-22| 23-25 | 26-30 |31| Form |
|----|----|-----|------|----------|-------|--|--------- |
|PO | RT | / | SVi | / / vf | XO |Rc| SVL-Form |
Pseudo-code:
```
if SVi[3:4] = 0b11 then
    # store pack and unpack in SVSTATE
    SVSTATE[53] <- SVi[5]
    SVSTATE[54] <- SVi[6]
    RT <- [0]*62 || SVSTATE[53:54]
else
    step <- SVSTATE_NEXT(SVi, vf)
    RT <- [0]*57 || step
```
Special Registers Altered:
CR0 (if Rc=1)
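The first branch of the pseudo-code can be illustrated with a minimal Python sketch. It assumes that `SVi` is the 7-bit immediate numbered MSB0 (bit 0 is the most significant bit), as in the rest of the Power ISA. The function name `svstep_pack_unpack` is hypothetical, and the `SVSTATE_NEXT` branch is deliberately not modelled because its definition is not given in this section.

```python
def svstep_pack_unpack(svi):
    """Model only the SVi[3:4] = 0b11 branch of svstep.

    svi is a 7-bit immediate in MSB0 numbering, so bit 6 is the LSB.
    Returns (pack, unpack, rt) or None when the other branch applies.
    """
    if not 0 <= svi < 128:
        raise ValueError("SVi is a 7-bit field")
    # SVi[3:4] occupies the bits with weights 8 and 4.
    if (svi >> 2) & 0b11 != 0b11:
        return None  # SVSTATE_NEXT branch, not modelled here
    pack = (svi >> 1) & 1    # SVi[5] -> SVSTATE[53]
    unpack = svi & 1         # SVi[6] -> SVSTATE[54]
    rt = (pack << 1) | unpack  # RT <- [0]*62 || SVSTATE[53:54]
    return pack, unpack, rt
```

For instance, `svstep_pack_unpack(0b0001110)` selects the pack/unpack branch with pack set and unpack clear.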
-------------
\newpage{}
# setvl
SVL-Form
* setvl RT,RA,SVi,vf,vs,ms (Rc=0)
* setvl. RT,RA,SVi,vf,vs,ms (Rc=1)
Pseudo-code:

    overflow <- 0b0
    VLimm <- SVi + 1
    # set or get MVL
    if ms = 1 then MVL <- VLimm[0:6]
    else           MVL <- SVSTATE[0:6]
    # set or get VL
    if vs = 0                then VL <- SVSTATE[7:13]
    else if _RA != 0         then
        if (RA) >u 0b1111111 then
            VL <- 0b1111111
            overflow <- 0b1
        else                      VL <- (RA)[57:63]
    else if _RT = 0          then VL <- VLimm[0:6]
    else if CTR >u 0b1111111 then
        VL <- 0b1111111
        overflow <- 0b1
    else                          VL <- CTR[57:63]
    # limit VL to within MVL
    if VL >u MVL then
        overflow <- 0b1
        VL <- MVL
    SVSTATE[0:6] <- MVL
    SVSTATE[7:13] <- VL
    if _RT != 0 then
        GPR(_RT) <- [0]*57 || VL
    if ((¬vs) & ¬(ms)) = 0 then
        # set requested Vertical-First mode, clear persist
        SVSTATE[63] <- vf
        SVSTATE[62] <- 0b0
Special Registers Altered:
CR0 (if Rc=1)
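As an aid to reading the `setvl` pseudo-code, the following Python sketch models the same decision sequence. It is an illustration under stated assumptions, not a reference implementation: the function name `setvl_model` and the dict-based `state` are hypothetical, register values are treated as unsigned integers, and `VLimm[0:6]` is read as the low 7 bits of `SVi + 1`.

```python
def setvl_model(state, SVi, ms, vs, vf,
                ra_index=0, ra_value=0, rt_index=0, ctr=0):
    """Sketch of setvl. `state` holds maxvl, vl, vfirst and RMpst.

    Returns (new_state, rt_value, overflow); rt_value is None when
    the RT register number is zero and no GPR is written.
    """
    overflow = False
    # Assumption: VLimm[0:6] is the low 7 bits of SVi + 1.
    vlimm = (SVi + 1) & 0x7F

    # set or get MVL
    mvl = vlimm if ms == 1 else state["maxvl"]

    # set or get VL
    if vs == 0:
        vl = state["vl"]
    elif ra_index != 0:
        overflow = ra_value > 0x7F   # saturate at 127
        vl = min(ra_value, 0x7F)
    elif rt_index == 0:
        vl = vlimm
    else:
        overflow = ctr > 0x7F        # saturate at 127
        vl = min(ctr, 0x7F)

    # limit VL to within MVL
    if vl > mvl:
        overflow = True
        vl = mvl

    new_state = dict(state, maxvl=mvl, vl=vl)
    if vs == 1 or ms == 1:
        # set requested Vertical-First mode, clear persist
        new_state["vfirst"] = vf
        new_state["RMpst"] = 0
    rt_value = vl if rt_index != 0 else None
    return new_state, rt_value, overflow
```

Note that the final condition in the pseudo-code, `((¬vs) & ¬(ms)) = 0`, is true whenever at least one of `vs` or `ms` is set, which is why the sketch tests `vs == 1 or ms == 1`.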
-------------
\newpage{}
# SVSTATE SPR
The format of the SVSTATE SPR is as follows:
| Field | Name | Description |
| ----- | -------- | --------------------- |
| 0:6 | maxvl | Max Vector Length |
| 7:13 | vl | Vector Length |
| 14:20 | srcstep | for srcstep = 0..VL-1 |
| 21:27 | dststep | for dststep = 0..VL-1 |
| 28:29 | dsubstep | for substep = 0..SUBVL-1 |
| 30:31 | ssubstep | for substep = 0..SUBVL-1 |
| 32:33 | mi0 | REMAP RA/FRA/BFA SVSHAPE0-3 |
| 34:35 | mi1 | REMAP RB/FRB/BFB SVSHAPE0-3 |
| 36:37 | mi2 | REMAP RC/FRT SVSHAPE0-3 |
| 38:39 | mo0 | REMAP RT/FRT/BF SVSHAPE0-3 |
| 40:41 | mo1 | REMAP EA/RS/FRS SVSHAPE0-3 |
| 42:46 | SVme | REMAP enable (RA-RT) |
| 47:52 | rsvd | reserved |
| 53 | pack | PACK (srcstep reorder) |
| 54 | unpack | UNPACK (dststep order) |
| 55:61 | hphint | Horizontal Hint |
| 62 | RMpst | REMAP persistence |
| 63 | vfirst | Vertical First mode |
Notes:
* The entries are truncated to be within range. Attempts to set VL to
greater than MAXVL will truncate VL.
* Setting srcstep, dststep to 64 or greater, or VL or MVL to greater
than 64 is reserved and will cause an illegal instruction trap.
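The field table can be turned into a small packing and unpacking helper, which may be useful in simulators or debuggers. The sketch below assumes the usual Power ISA MSB0 numbering (bit 0 is the most significant bit of the 64-bit SPR). The names `SVSTATE_FIELDS`, `svstate_pack` and `svstate_unpack` are hypothetical.

```python
# (first bit, last bit) in MSB0 numbering, copied from the table above
SVSTATE_FIELDS = {
    "maxvl": (0, 6), "vl": (7, 13),
    "srcstep": (14, 20), "dststep": (21, 27),
    "dsubstep": (28, 29), "ssubstep": (30, 31),
    "mi0": (32, 33), "mi1": (34, 35), "mi2": (36, 37),
    "mo0": (38, 39), "mo1": (40, 41),
    "SVme": (42, 46), "rsvd": (47, 52),
    "pack": (53, 53), "unpack": (54, 54),
    "hphint": (55, 61), "RMpst": (62, 62), "vfirst": (63, 63),
}


def svstate_unpack(value):
    """Split a 64-bit SVSTATE value into a dict of named fields."""
    fields = {}
    for name, (first, last) in SVSTATE_FIELDS.items():
        width = last - first + 1
        shift = 63 - last  # MSB0 bit `last` sits `shift` bits above the LSB
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields


def svstate_pack(**fields):
    """Build a 64-bit SVSTATE value; omitted fields default to zero."""
    value = 0
    for name, field_value in fields.items():
        first, last = SVSTATE_FIELDS[name]
        width = last - first + 1
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        value |= field_value << (63 - last)
    return value
```

A round trip such as `svstate_unpack(svstate_pack(maxvl=8, vl=5))` returns the same field values, with every other field zero.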
**SVSTATE Fields**
SVSTATE is a standard SPR that (if REMAP is not activated) contains sufficient
self-contained information for a full context save/restore.
SVSTATE contains (and permits setting of):
* MVL (the Maximum Vector Length) - declares (statically) how
much of a regfile is to be reserved for Vector elements
* VL - Vector Length
* dststep - the destination element offset of the current parallel
instruction being executed
* srcstep - for twin-predication, the source element offset as well.
* ssubstep - the source subvector element offset of the current
parallel instruction being executed
* dsubstep - the destination subvector element offset of the current
parallel instruction being executed
* vfirst - Vertical First mode. srcstep, dststep and substep
**do not advance** unless explicitly requested to do so with
the `svstep` instruction
* RMpst - REMAP persistence. REMAP will apply only to the following
instruction unless this bit is set, in which case REMAP "persists".
Reset (cleared) on use of the `setvl` instruction if used to
alter VL or MVL.
* Pack - if set then srcstep/substep VL/SUBVL loop-ordering is inverted.
* UnPack - if set then dststep/substep VL/SUBVL loop-ordering is inverted.
* hphint - Horizontal Parallelism Hint. Indicates that
no Hazards exist between this number of sequentially-accessed
elements (including after REMAP). In Vertical First Mode
hardware **MUST** perform this many elements in parallel
per instruction. Set to zero to indicate "no hint".
* SVme - REMAP enable bits, indicating which register is to be
REMAPed: RA, RB, RC, RT and EA are the canonical (typical) register names
associated with each bit, with RA being the LSB and EA being the MSB.
See table below for ordering. When `SVme` is zero (0b00000) REMAP
is **fully disabled and inactive**.
* mi0-mi2/mo0-mo1 - when the corresponding SVme bit is set, these
indicate the SVSHAPE (0-3) that the corresponding register (RA etc)
should use
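The relationship between `SVme` and the five selector fields can be sketched as follows. The pairing of selectors with `SVme` bits is taken from the description above (RA is the LSB, EA is the MSB, and the SVSTATE table associates mi0 with RA, mi1 with RB, mi2 with RC, mo0 with RT and mo1 with EA). The names `REMAP_SLOTS` and `active_remaps` are hypothetical.

```python
# Selector names ordered from SVme bit 0 (LSB, RA) to bit 4 (MSB, EA)
REMAP_SLOTS = ["mi0", "mi1", "mi2", "mo0", "mo1"]


def active_remaps(svme, selectors):
    """Return {selector name: SVSHAPE index} for each enabled SVme bit.

    svme      -- 5-bit REMAP enable field
    selectors -- dict giving the 2-bit mi0..mo1 values (0-3)
    An empty result means REMAP is fully disabled (SVme = 0b00000).
    """
    return {
        name: selectors[name]
        for bit, name in enumerate(REMAP_SLOTS)
        if (svme >> bit) & 1
    }
```

With `svme = 0b10001`, only the RA and EA operands are remapped, each through the SVSHAPE named by `mi0` and `mo1` respectively.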
Programmer's Note: when REMAP is activated it becomes necessary on any
context-switch (Interrupt or Function call) to detect (or know in advance)
that REMAP is enabled and to additionally save/restore the four SVSHAPE
SPRs, SVSHAPE0-3. Given that this is expected to be a rare occurrence it was
deemed unreasonable to burden every context-switch or function call with
mandatory save/restore of SVSHAPEs, and consequently it is a *callee*
(and Trap Handler) responsibility. Callees (and Trap Handlers) **MUST**
avoid using any SVP64 instructions during the period where state
could be adversely affected. SVP64 purely relies on Scalar instructions,
so Scalar instructions (except the SVP64 Management ones and mtspr and
mfspr) are 100% guaranteed to have zero impact on SVP64 state.
**Max Vector Length (maxvl)**
MAXVECTORLENGTH is the same concept as MVL in RISC-V RVV, except that it
is variable length and may be dynamically set. MVL is limited to 7 bits
(in the first version of SVP64) and consequently the maximum number of
elements is limited to between 0 and 127.
Programmer's Note: Except by directly using `mtspr` on SVSTATE, which may
result in performance penalties on some hardware implementations, SVSTATE's `maxvl`
field may only be set **statically** as an immediate, by the `setvl` instruction.
It may **NOT** be set dynamically from a register. Compiler writers and assembly
programmers are expected to perform static register file analysis, subdivision,
and allocation and only utilise `setvl`. Direct writing to SVSTATE in order to
"bypass" this Note could, in less-advanced implementations, potentially cause stalling.
**Vector Length (vl)**
`setvl` is conceptually similar to, but different from, the Cray, SX Aurora,
and RISC-V RVV equivalents. Similar to RVV, VL is set to be within
the range 0 <= VL <= MVL:

    VL = rd = MIN(vlen, MVL)

where 0 <= MVL <= XLEN
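The `MIN(vlen, MVL)` relationship is what makes Cray-style strip-mining loops possible: each pass requests the number of elements still outstanding and receives at most MVL. The sketch below illustrates the idea only; `strip_mine` is a hypothetical helper, not part of the specification.

```python
def strip_mine(n, mvl):
    """Yield (start, vl) chunks covering n elements.

    Each chunk length is vl = min(remaining, mvl), mirroring
    VL = MIN(vlen, MVL) where vlen is the remaining element count.
    """
    if mvl <= 0:
        raise ValueError("MVL must be positive for the loop to progress")
    start = 0
    while start < n:
        vl = min(n - start, mvl)
        yield start, vl
        start += vl
```

For ten elements and an MVL of four, the loop body runs three times with VL of 4, 4 and 2.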
**SUBVL - Sub Vector Length**
This is a "group by quantity" that effectively asks each iteration
of the hardware loop to load SUBVL elements of width elwidth at a
time. Effectively, SUBVL is like a SIMD multiplier: instead of just 1
operation issued, SUBVL operations are issued.
The main effect of SUBVL is that predication bits are applied per
**group**, rather than by individual element. Legal values are 0 to 3,
representing 1 operation through 4 operations respectively.
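Per-group predication can be sketched as a nested loop in which one predicate bit gates a whole group of SUBVL elements. The sketch makes two assumptions that are not stated in this section: predicate bit 0 (the LSB) corresponds to group 0, and the elements of a group are contiguous (element index = group * SUBVL + sub-element). The name `subvl_loop` is hypothetical.

```python
def subvl_loop(vl, subvl_field, predicate, op):
    """Apply `op` to every element of every enabled group.

    subvl_field -- encoded value 0..3, meaning 1..4 elements per group
    predicate   -- integer bitmask, one bit per group (assumed LSB first)
    Returns the list of results in execution order.
    """
    if not 0 <= subvl_field <= 3:
        raise ValueError("SUBVL field is 2 bits wide")
    subvl = subvl_field + 1
    results = []
    for step in range(vl):
        if not (predicate >> step) & 1:
            continue  # the whole group is masked out together
        for substep in range(subvl):
            results.append(op(step * subvl + substep))
    return results
```

With `vl=3`, a SUBVL field of 1 (two elements per group) and predicate `0b101`, groups 0 and 2 execute and group 1 is skipped in its entirety.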
**Horizontal Parallelism**
Hardware may not be able to detect opportunities for parallelism, or the
lack of overlap between loops, that a programmer (or compiler) knows
to exist.
For hphint, the number chosen must be consistently
executed **every time**. Hardware is not permitted to execute five
computations for one instruction then three on the next.
hphint is a hint from the compiler to hardware that exactly this
many elements may be safely executed in parallel, without hazards
(including Memory accesses).
Interestingly, when hphint is set equal to VL, it is in effect
as if Vertical First mode were not set, because the hardware is
given the option to run through all elements in an instruction.
This is exactly what Horizontal-First is: a for-loop from 0 to VL-1
except that the hardware may *choose* the number of elements.
*Note to programmers: changing VL during the middle of such modes
should be done only with due care and respect for the fact that SVSTATE
has exactly the same peer-level status as a Program Counter.*
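The difference between the two modes, and the observation that hphint equal to VL behaves like Horizontal-First, can be illustrated with a sketch of execution order. This is a simplified model under assumptions: it treats a program as a list of instruction names, assumes that each Vertical-First pass covers exactly hphint elements before stepping, and treats hphint of zero ("no hint") as one element per pass. The function names are hypothetical.

```python
def horizontal_first(program, vl):
    """Each instruction loops over elements 0..VL-1 before the next issues."""
    return [(insn, step) for insn in program for step in range(vl)]


def vertical_first(program, vl, hphint=0):
    """Every instruction runs on one batch of elements, then the step advances.

    Assumption: a pass covers `hphint` elements (one when hphint is 0),
    and the outer loop stands in for the explicit stepping.
    """
    batch = hphint if hphint > 0 else 1
    trace = []
    for base in range(0, vl, batch):
        for insn in program:
            for step in range(base, min(base + batch, vl)):
                trace.append((insn, step))
    return trace
```

In this model `vertical_first(program, vl, hphint=vl)` produces the same order as `horizontal_first(program, vl)`, matching the remark above.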
-------------
\newpage{}
# SVL-Form
Add the following to Book I, 1.6.1, SVL-Form
```
|0 |6 |11 |16 |23 |24 |25 |26 |31 |
| PO | RT | RA | SVi |ms |vs |vf | XO |Rc |
| PO | RT | / | SVi |/ |/ |vf | XO |Rc |
```
* Add `SVL` to `RA (11:15)` Field in Book I, 1.6.2
* Add `SVL` to `RT (6:10)` Field in Book I, 1.6.2
* Add `SVL` to `Rc (31)` Field in Book I, 1.6.2
* Add `SVL` to `XO (26:30)` Field in Book I, 1.6.2
Add the following to Book I, 1.6.2
```
ms (23)
Field used in Simple-V to specify whether MVL (maxvl in the SVSTATE SPR)
is to be set
Formats: SVL
vf (25)
Field used in Simple-V to specify whether "Vertical" Mode is set
(vfirst in the SVSTATE SPR)
Formats: SVL
vs (24)
Field used in Simple-V to specify whether VL (vl in the SVSTATE SPR) is to be set
Formats: SVL
SVi (16:22)
Simple-V immediate field for setting VL or MVL (vl, maxvl in the SVSTATE SPR)
Formats: SVL
```
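The SVL-Form layout can be exercised with a small encode/decode sketch, which may help when adding the form to an assembler or disassembler. It assumes MSB0 bit numbering on a 32-bit instruction word and takes XO as bits 26:30, as drawn in the layout above. The names `SVL_FIELDS`, `encode_svl` and `decode_svl` are hypothetical, and no real PO or XO values are implied.

```python
# (first bit, last bit) in MSB0 numbering for the SVL-Form
SVL_FIELDS = {
    "PO": (0, 5), "RT": (6, 10), "RA": (11, 15), "SVi": (16, 22),
    "ms": (23, 23), "vs": (24, 24), "vf": (25, 25),
    "XO": (26, 30), "Rc": (31, 31),
}


def encode_svl(**fields):
    """Assemble a 32-bit SVL-Form word; omitted fields are zero."""
    insn = 0
    for name, value in fields.items():
        first, last = SVL_FIELDS[name]
        width = last - first + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        insn |= value << (31 - last)
    return insn


def decode_svl(insn):
    """Split a 32-bit SVL-Form word into its named fields."""
    return {
        name: (insn >> (31 - last)) & ((1 << (last - first + 1)) - 1)
        for name, (first, last) in SVL_FIELDS.items()
    }
```

For `svstep`, the RA, ms and vs positions are reserved (shown as `/` in the second row of the layout) and would be encoded as zero.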
# Appendices
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
| Form | Book | Page | Version | mnemonic | Description |
|------|------|------|---------|----------|-------------|
| SVL | I | # | 3.0B | svstep | Vertical-First Stepping and status reporting |