# setvl: Set Vector Length

* <http://lists.libre-soc.org/pipermail/libre-soc-dev/2020-November/001366.html>
* <https://bugs.libre-soc.org/show_bug.cgi?id=535>
* <https://bugs.libre-soc.org/show_bug.cgi?id=587>
* <https://bugs.libre-soc.org/show_bug.cgi?id=568> TODO
* <https://bugs.libre-soc.org/show_bug.cgi?id=927> bug - RT>=32
* <https://bugs.libre-soc.org/show_bug.cgi?id=862> VF Predication
* <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vsetvlivsetvl-instructions>

* pseudocode [[openpower/isa/simplev]]

Add the following section to the Simple-V Chapter

| 0-5 | 6-10 | 11-15 | 16-22 | 23 24 25 | 26-30 | 31 | FORM     |
| --- | ---- | ----- | ----- | -------- | ----- | -- | -------- |
| PO  | RT   | RA    | SVi   | ms vs vf | XO    | Rc | SVL-Form |

* setvl RT,RA,SVi,vf,vs,ms (Rc=0)
* setvl. RT,RA,SVi,vf,vs,ms (Rc=1)

    overflow <- 0b0 # sets CR.SO if set and if Rc=1
    if ms = 1 then MVL <- VLimm[0:6]
    else MVL <- SVSTATE[0:6]
    if vs = 0 then VL <- SVSTATE[7:13]
    if (RA) >u 0b1111111 then
    else VL <- (RA)[57:63]
    else if _RT = 0 then VL <- VLimm[0:6]
    else if CTR >u 0b1111111 then
    # limit VL to within MVL
    GPR(_RT) <- [0]*57 || VL
    # MAXVL is a static "state-reset" opportunity so VF is only set then.
    SVSTATE[63] <- vf # set Vertical-First mode
    SVSTATE[62] <- 0b0 # clear persist bit

Special Registers Altered:

* `SVi` - bits 16-22 - an immediate operand for setting MVL and/or VL
* `ms` - bit 23 - allows for setting of MVL
* `vs` - bit 24 - allows for setting of VL
* `vf` - bit 25 - sets "Vertical First Mode".

Note that in immediate setting mode VL and MVL start from **one**,
but that this is compensated for in the assembly notation:
an immediate value of 1 in assembler notation
actually places the value 0b0000000 in the `SVi` field bits.
On execution, the `setvl` instruction adds one to the decoded
`SVi` field bits, resulting in
VL/MVL being set to 1. This allows VL to be set to values
ranging from 1 to 128 with only 7 bits instead of 8.
Setting VL to 0 would result in all Vector operations becoming `nop`. If this is
truly desired (nop behaviour) then setting VL and MVL to zero is to be
done via the [[SVSTATE SPR|sv/sprs]].
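
The off-by-one immediate encoding described above can be illustrated with a minimal Python sketch. The helper names `encode_svi` and `decode_svi` are hypothetical and are not part of any Libre-SOC tool; the sketch models only the mapping between the assembler immediate and the 7-bit `SVi` field as the text describes it.

```python
SVI_BITS = 7  # width of the SVi field (bits 16-22)


def encode_svi(immediate: int) -> int:
    """Assembler side: map a VL/MVL immediate (1..128) to the SVi field."""
    if not 1 <= immediate <= (1 << SVI_BITS):
        raise ValueError("immediate must be in the range 1..128")
    return immediate - 1


def decode_svi(field: int) -> int:
    """Execution side: setvl adds one to the decoded SVi field."""
    if not 0 <= field < (1 << SVI_BITS):
        raise ValueError("SVi is a 7-bit field")
    return field + 1


# An immediate of 1 is stored as all-zero field bits.
print(format(encode_svi(1), "07b"))
# The all-ones field decodes to the largest immediate.
print(decode_svi(0b1111111))
```

Zero is deliberately unrepresentable here, which matches the statement that a zero VL or MVL has to be written through the SVSTATE SPR instead.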

Note that `setmvli` is a pseudo-op, based on RA/RT=0, and `setvli` likewise:

    setvli VL=8 : setvl r0, r0, VL=8, vf=0, vs=1, ms=0
    setvli. VL=8 : setvl. r0, r0, VL=8, vf=0, vs=1, ms=0
    setmvli MVL=8 : setvl r0, r0, MVL=8, vf=0, vs=0, ms=1
    setmvli. MVL=8 : setvl. r0, r0, MVL=8, vf=0, vs=0, ms=1

Additional pseudo-op for obtaining VL without modifying it (or any state):

    getvl r5 : setvl r5, r0, vf=0, vs=0, ms=0
    getvl. r5 : setvl. r5, r0, vf=0, vs=0, ms=0

Note that whilst it is possible to set both MVL and VL from the same
immediate, it is not possible to set them to different immediates in
the same instruction. Doing so would require two instructions.

Use of `setvl` results in changes to the SVSTATE SPR. See [[sv/sprs]].

**Selecting sources for VL**

There is considerable opcode pressure. Consequently, the `RA` and `RT`
fields select the source from which VL is set, as follows:

| condition | effect |
| --------- | ------ |
| `vs=1, RA=0, RT!=0` | VL,RT set to MIN(MVL, CTR) |
| `vs=1, RA=0, RT=0` | VL set to MIN(MVL, SVi+1) |
| `vs=1, RA!=0, RT=0` | VL set to MIN(MVL, RA) |
| `vs=1, RA!=0, RT!=0` | VL,RT set to MIN(MVL, RA) |

The reasoning here is that the opportunity to set RT equal to the
immediate `SVi+1` is sacrificed in favour of setting from CTR.
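
The four `vs=1` cases in the table can be modelled with a short Python sketch. This is an illustration only, not the normative pseudocode: the function `select_vl` and its argument names are hypothetical, register contents are passed in as plain integers, and saturation of the source to 7 bits is assumed to be covered by the `MIN` against MVL.

```python
def select_vl(mvl, ra, rt, gpr, ctr, svi):
    """Return (VL, value written to RT or None) for the vs=1 cases.

    ra and rt are register *numbers*; gpr maps register number to value.
    """
    if ra != 0:
        source = gpr[ra]   # VL comes from the register RA
    elif rt != 0:
        source = ctr       # RA=0, RT!=0: VL comes from CTR
    else:
        source = svi + 1   # RA=0, RT=0: VL comes from the immediate
    vl = min(mvl, source)
    # RT is only written when the RT field is non-zero.
    rt_value = vl if rt != 0 else None
    return vl, rt_value


gpr = {3: 100}
print(select_vl(mvl=8, ra=3, rt=0, gpr=gpr, ctr=0, svi=0))   # from RA, no RT write
print(select_vl(mvl=8, ra=0, rt=5, gpr=gpr, ctr=3, svi=0))   # from CTR, RT written
print(select_vl(mvl=16, ra=0, rt=0, gpr=gpr, ctr=99, svi=7))  # from SVi+1
```

Written this way the trade-off is visible: there is no branch in which RT receives `SVi+1`, because the `RA=0, RT!=0` encoding is spent on CTR.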

## Unusual Rc=1 behaviour

Normally, the return result from an instruction is in `RT`. With
it being possible for `RT=0` to mean that `CTR` mode is to be read,
some different semantics are needed.

CR Field 0, when `Rc=1`, may be set even if `RT=0`. The reason is that
overflow may occur: `VL`, if set either from an immediate or from `CTR`,
may not exceed `MAXVL`, and if it does, `CR0.SO` must be set.

In reality it is **`VL`** being set. Therefore, rather
than `CR0` testing `RT` when `Rc=1`, `CR0.EQ` is set if `VL=0` and `CR0.GE`
is set if `VL` is non-zero.
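
As a rough illustration of the `Rc=1` rules above, the following Python sketch derives the flags from a requested length and MVL. The function name `setvl_cr0` and the dictionary keys are hypothetical; the key `ge` simply follows the naming used in the text, and how it maps onto the physical CR0 bits is not modelled.

```python
def setvl_cr0(requested: int, mvl: int) -> dict:
    """Model the VL result and the CR0 flags described for Rc=1."""
    vl = min(requested, mvl)       # VL may not exceed MVL
    return {
        "vl": vl,
        "so": requested > mvl,     # overflow: the request had to be clamped
        "eq": vl == 0,             # tested on VL, not on RT
        "ge": vl != 0,
    }


print(setvl_cr0(requested=100, mvl=8))
print(setvl_cr0(requested=5, mvl=8))
```

The point of the sketch is that none of the flags depend on `RT`, so they remain meaningful when `RT=0`.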

Sub-vector elements are not to be considered "Vertical". The vec2/3/4
is to be considered as if it were a single element. Caveats exist for
[[sv/mv.swizzle]] and [[sv/mv.vec]] when Pack/Unpack is enabled,
due to the order in which VL and SUBVL loops are applied being
swapped (outer-inner becomes inner-outer).

### Core concept loop

    loop:
    setvl a3, a0, MVL=8 # update a3 with vl
                        # (# of elements this iteration)

    # do vector operations at up to 8 length (MVL=8)

    sub a0, a0, a3      # Decrement count by vl
    bnez a0, loop       # Any more?
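
The loop above is a strip-mining pattern: each pass handles at most MVL elements and the remaining count shrinks by the VL actually granted. A minimal Python sketch of the same control flow follows; `strip_mine` is a hypothetical name, and the vector work itself is replaced by recording each VL.

```python
def strip_mine(count: int, mvl: int = 8) -> list:
    """Return the VL used on each pass of the setvl loop."""
    lengths = []
    while True:
        vl = min(count, mvl)   # setvl a3, a0, MVL=8
        lengths.append(vl)     # vector operations on vl elements go here
        count -= vl            # sub a0, a0, a3
        if count == 0:         # bnez a0, loop
            break
    return lengths


print(strip_mine(20))  # three passes: two full, one partial
```

Like the assembly, the sketch tests the count at the bottom of the loop, so a count of zero still makes one pass with a VL of zero.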

    setvli. r4, r3, MVL=64

### Load/Store-Multi (selective)

Up to 64 FPRs will be loaded here. `r3` has one bit set
for each FP register required to be loaded. The block of memory
from which the registers are loaded is contiguous (no gaps):
any FP register which has a corresponding zero bit in `r3`
is *unaltered*. In essence this is a selective LD-multi with
"Scatter" capability.

    setvli r0, MVL=64, VL=64
    sv.lfd/dm=r3 *r0, 0(r30) # selective load 64 FP registers

Up to 64 FPRs will be saved here. Again, `r3` has one bit set
for each FP register required to be saved.

    setvli r0, MVL=64, VL=64
    sv.stfd/sm=r3 *fp0, 0(r30) # selective store 64 FP registers
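
The selective load and store can be pictured with the following Python sketch. It is a behavioural illustration under stated assumptions, not a description of the hardware: the function names are hypothetical, bit `i` of the mask (counting from the least significant bit) is assumed to correspond to register `i`, and memory is modelled as a plain list of values.

```python
def selective_load(regs, memory, mask, vl=64):
    """Load consecutive memory values into the registers whose mask bit is 1."""
    result = list(regs)
    src = 0  # index of the next contiguous memory element
    for i in range(vl):
        if (mask >> i) & 1:
            result[i] = memory[src]
            src += 1  # memory advances only for selected registers
    return result


def selective_store(regs, mask, vl=64):
    """Return the contiguous memory image of the selected registers."""
    return [regs[i] for i in range(vl) if (mask >> i) & 1]


regs = selective_load([0.0] * 4, [1.5, 2.5], mask=0b1010, vl=4)
print(regs)
print(selective_store(regs, mask=0b1010, vl=4))
```

Registers with a zero mask bit keep their previous contents, and a store followed by a load with the same mask restores exactly the selected registers.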