**OBSOLETE SPEC** see [[openpower/sv/bitmanip]]

# Bitmanip opcodes

These are bit manipulation opcodes that, if provided, augment SimpleV for
the purposes of efficiently accelerating Vector Processing, 3D Graphics
and Video Processing.

The justification for their inclusion in BitManip is identical to the
significant justification that went into their inclusion in the
RISC-V Vector Extension (under the "Predicate Mask" opcodes section).
See
<https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-mask-instructions>
for details.

# Predicate Masks

SV uses standard integer scalar registers as predicate bitmasks. The
majority of RISC-V RV32I / RV64I bit-level instructions are therefore
perfectly adequate for operating on them. However, some mask operations
present in RVV have no RV32I / RV64I equivalent.

## logical bit-wise instructions

These are the available bitwise instructions in RVV:

    vmand.mm    vd, vs2, vs1 # vd[i] =   vs2[i].LSB &&  vs1[i].LSB
    vmnand.mm   vd, vs2, vs1 # vd[i] = !(vs2[i].LSB &&  vs1[i].LSB)
    vmandnot.mm vd, vs2, vs1 # vd[i] =   vs2[i].LSB && !vs1[i].LSB
    vmxor.mm    vd, vs2, vs1 # vd[i] =   vs2[i].LSB ^^  vs1[i].LSB
    vmor.mm     vd, vs2, vs1 # vd[i] =   vs2[i].LSB ||  vs1[i].LSB
    vmnor.mm    vd, vs2, vs1 # vd[i] = !(vs2[i].LSB ||  vs1[i].LSB)
    vmornot.mm  vd, vs2, vs1 # vd[i] =   vs2[i].LSB || !vs1[i].LSB
    vmxnor.mm   vd, vs2, vs1 # vd[i] = !(vs2[i].LSB ^^  vs1[i].LSB)

The ones that exist in scalar RISC-V are:

    AND rd, rs1, rs2 # rd = rs1 & rs2
    OR  rd, rs1, rs2 # rd = rs1 | rs2
    XOR rd, rs1, rs2 # rd = rs1 ^ rs2

The ones in Bitmanip are:

    ANDN rd, rs1, rs2 # rd = rs1 & ~rs2
    ORN  rd, rs1, rs2 # rd = rs1 | ~rs2
    XNOR rd, rs1, rs2 # rd = rs1 ^ ~rs2

This leaves:

    NOR
    NAND

These are currently listed as "pseudo-ops" in BitManip-Draft (0.91).
They need to be actual opcodes.
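
As a rough illustration of why dedicated opcodes matter, the sketch below models the scalar operations on XLEN-bit integers. It assumes that, without a real opcode, NOR and NAND each cost a base operation followed by an inversion (the scalar `NOT` pseudo-instruction is `XORI rd, rs1, -1`). The helper names and the choice of `XLEN = 64` are illustrative only.

```python
XLEN = 64                      # assumed register width for this sketch
MASK = (1 << XLEN) - 1         # keeps Python's unbounded ints to XLEN bits

def not_(rs1):                 # scalar NOT, i.e. XORI rd, rs1, -1
    return ~rs1 & MASK

def andn(rs1, rs2):            # Bitmanip ANDN: rs1 & ~rs2
    return rs1 & ~rs2 & MASK

def orn(rs1, rs2):             # Bitmanip ORN: rs1 | ~rs2
    return (rs1 | ~rs2) & MASK

def xnor(rs1, rs2):            # Bitmanip XNOR: rs1 ^ ~rs2
    return (rs1 ^ ~rs2) & MASK

def nor(rs1, rs2):             # two steps: OR, then NOT
    return not_(rs1 | rs2)

def nand(rs1, rs2):            # two steps: AND, then NOT
    return not_(rs1 & rs2)
```

With a predicate mask held in a scalar register, `nor` and `nand` are the only two of the eight RVV mask operations that need a second instruction under this assumption.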


TODO: there is an extensive table in RVV of bit-level operations.
Columns 0 to 3 give the output bit for the input pairs
(src1, src2) = (0,0), (0,1), (1,0) and (1,1) respectively:

| 0 | 1 | 2 | 3 | instruction                | pseudoinstruction |
| - | - | - | - | -------------------------- | ----------------- |
| 0 | 0 | 0 | 0 | vmxor.mm vd, vd, vd        | vmclr.m vd        |
| 1 | 0 | 0 | 0 | vmnor.mm vd, src1, src2    |                   |
| 0 | 1 | 0 | 0 | vmandnot.mm vd, src2, src1 |                   |
| 1 | 1 | 0 | 0 | vmnand.mm vd, src1, src1   | vmnot.m vd, src1  |
| 0 | 0 | 1 | 0 | vmandnot.mm vd, src1, src2 |                   |
| 1 | 0 | 1 | 0 | vmnand.mm vd, src2, src2   | vmnot.m vd, src2  |
| 0 | 1 | 1 | 0 | vmxor.mm vd, src1, src2    |                   |
| 1 | 1 | 1 | 0 | vmnand.mm vd, src1, src2   |                   |
| 0 | 0 | 0 | 1 | vmand.mm vd, src1, src2    |                   |
| 1 | 0 | 0 | 1 | vmxnor.mm vd, src1, src2   |                   |
| 0 | 1 | 0 | 1 | vmand.mm vd, src2, src2    | vmcpy.m vd, src2  |
| 1 | 1 | 0 | 1 | vmornot.mm vd, src2, src1  |                   |
| 0 | 0 | 1 | 1 | vmand.mm vd, src1, src1    | vmcpy.m vd, src1  |
| 1 | 0 | 1 | 1 | vmornot.mm vd, src1, src2  |                   |
| 0 | 1 | 1 | 1 | vmor.mm vd, src1, src2     |                   |
| 1 | 1 | 1 | 1 | vmxnor.mm vd, vd, vd       | vmset.m vd        |
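
The rows of the table can be derived mechanically. The following sketch evaluates each two-input mask operation on single bits and lists the outputs in the order (src1, src2) = (0,0), (0,1), (1,0), (1,1). The `OPS` dictionary and `truth_row` helper are illustrative names, not part of any specification.

```python
# Single-bit models of the RVV mask operations, taking (src1, src2).
OPS = {
    "vmand":    lambda a, b: a & b,
    "vmnand":   lambda a, b: ~(a & b),
    "vmandnot": lambda a, b: a & ~b,
    "vmxor":    lambda a, b: a ^ b,
    "vmor":     lambda a, b: a | b,
    "vmnor":    lambda a, b: ~(a | b),
    "vmornot":  lambda a, b: a | ~b,
    "vmxnor":   lambda a, b: ~(a ^ b),
}

def truth_row(op):
    """Output bits for (src1, src2) = (0,0), (0,1), (1,0), (1,1)."""
    return tuple(op(s1, s2) & 1 for s1 in (0, 1) for s2 in (0, 1))

if __name__ == "__main__":
    for name, op in OPS.items():
        print(name, truth_row(op))
```

Swapping the operands, or passing the same source twice, yields the remaining rows such as `vmnot.m` and `vmcpy.m`.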

## pcnt - population count

Counts the number of bits set in the source register.

Pseudocode:

    unsigned int pcnt(unsigned int v) // count the number of bits set in v
    {
        unsigned int c; // c accumulates the total bits set in v
        for (c = 0; v; c++)
        {
            v &= v - 1; // clear the least significant bit set
        }
        return c;
    }

This instruction is present in BitManip.

## ffirst - find first bit

Finds the first bit set and returns its index.

Pseudocode:

    uint_xlen_t clz(uint_xlen_t rs1)
    {
        for (int count = 0; count < XLEN; count++)
            if ((rs1 << count) >> (XLEN - 1))
                return count;
        return XLEN; // ffirst would return -1 here
    }

This is similar but not identical to BitManip "CLZ". CLZ returns XLEN when
no bits are set, whereas RVV returns -1.
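
The difference in the all-zero case can be made concrete with a small sketch. It assumes, as stated above, that the only difference from CLZ is the result when no bits are set. The function name `ffirst` and the `xlen` parameter are illustrative.

```python
XLEN = 64  # assumed register width for this sketch

def clz(rs1, xlen=XLEN):
    """Mirror of the C pseudocode: returns xlen when rs1 is zero."""
    for count in range(xlen):
        # test the top bit of rs1 shifted left by count
        if (rs1 << count) & (1 << (xlen - 1)):
            return count
    return xlen

def ffirst(rs1, xlen=XLEN):
    """Same search, but returns -1 when no bits are set."""
    count = clz(rs1, xlen)
    return -1 if count == xlen else count
```

An implementation that already provides CLZ would, under this assumption, only need to remap the single result value XLEN to -1.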

## sbf - set before first bit

Sets all LSBs leading up to (but excluding) the first bit that is set in
the src, and sets zeros at and following that bit.
If the second operand is non-zero, the search continues
(in the same LSB to MSB order), beginning each time (including the first time)
from where 1s are set in the second operand.

A side-effect of the search is that when src is zero, the output is all ones.
If the second operand is non-zero and the src is zero, the output is a
copy of the second operand.

# Example

    7 6 5 4 3 2 1 0   Bit number

    1 0 0 1 0 1 0 0   a3 contents
                      sbf a2, a3, x0
    0 0 0 0 0 0 1 1   a2 contents

    1 0 0 1 0 1 0 1   a3 contents
                      sbf a2, a3, x0
    0 0 0 0 0 0 0 0   a2

    0 0 0 0 0 0 0 0   a3 contents
                      sbf a2, a3, x0
    1 1 1 1 1 1 1 1   a2

    1 1 0 0 0 0 1 1   a0 contents
    1 0 0 1 0 1 0 0   a3 contents
                      sbf a2, a3, a0
    0 1 0 0 0 0 1 1   a2 contents

Pseudo-code:

    def sbf(rd, rs1, rs2):
        regs[rd] = 0
        i = 0
        setting_mode = rs2 == x0 or (regs[rs2] & 1)

        while i < XLEN:
            bit = 1 << i

            # only reenable when predicate in use, and bit valid
            if not setting_mode and rs2 != x0:
                if regs[rs2] & bit:
                    # back into "setting" mode
                    setting_mode = True

            # skipping mode
            if not setting_mode:
                # skip any more 1s
                if regs[rs1] & bit:
                    i += 1
                    continue

            # setting mode, search for 1
            if regs[rs1] & bit: # found a bit in rs1:
                setting_mode = False
                # next loop starts skipping
            else:
                regs[rd] |= bit # always set except when search succeeds

            i += 1

A second version, with an explicit searching mode:

    def sbf(rd, rs1, rs2):
        regs[rd] = 0
        i = 0
        # start setting if no predicate or if 1st predicate bit set
        setting_mode = rs2 == x0 or (regs[rs2] & 1)
        while i < XLEN:
            bit = 1 << i
            if rs2 != x0 and (regs[rs2] & bit):
                # reset searching
                setting_mode = False
            if setting_mode:
                if regs[rs1] & bit: # found a bit in rs1: stop setting rd
                    setting_mode = False
                else:
                    regs[rd] |= bit
            elif rs2 != x0: # searching mode
                if regs[rs2] & bit:
                    setting_mode = True # back into "setting" mode
            i += 1
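
For experimenting with the worked examples, a minimal runnable sketch follows. It is written only to reproduce the example results above: it treats the second operand purely as a bit-enable mask, assumes masked-off positions are left as zero, and does not attempt to model the restart behaviour of the pseudo-code. The parameter names `mask` and `xlen` are illustrative, with `mask=None` standing in for `x0`.

```python
def sbf(src, mask=None, xlen=8):
    """Set-before-first sketch; mask=None models rs2 == x0 (no predicate)."""
    result = 0
    for i in range(xlen):
        bit = 1 << i
        if mask is not None and not (mask & bit):
            continue        # assumption: masked-off positions stay zero
        if src & bit:
            break           # first enabled set bit found: stop setting
        result |= bit       # still before the first set bit
    return result

if __name__ == "__main__":
    print(format(sbf(0b10010100), "08b"))
    print(format(sbf(0b10010100, mask=0b11000011), "08b"))
```

With an all-zero `src` the loop never breaks, so every enabled position is set, which gives all ones without a mask and a copy of the mask otherwise.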

## sif - set including first bit

Similar to sbf except that the bit which ends a run is included, i.e.
sets all LSBs leading up to *and including* the first bit that is set in
the src, and sets zeros following that bit.

The side-effect when the src is zero is also the same as for sbf:
the output is all 1s if src2 is zero, and the output is equal to src2 if
src2 is non-zero.


# Example

    7 6 5 4 3 2 1 0   Bit number

    1 0 0 1 0 1 0 0   a3 contents
                      sif a2, a3
    0 0 0 0 0 1 1 1   a2 contents

    1 0 0 1 0 1 0 1   a3 contents
                      sif a2, a3
    0 0 0 0 0 0 0 1   a2

    1 1 0 0 0 0 1 1   a0 contents
    1 0 0 1 0 1 0 0   a3 contents
                      sif a2, a3, a0
    1 1 x x x x 1 1   a2 contents

Pseudo-code:

    def sif(rd, rs1, rs2):
        regs[rd] = 0
        i = 0
        setting_mode = rs2 == x0 or (regs[rs2] & 1)

        while i < XLEN:
            bit = 1 << i

            # only reenable when predicate in use, and bit valid
            if not setting_mode and rs2 != x0:
                if regs[rs2] & bit:
                    # back into "setting" mode
                    setting_mode = True

            # skipping mode
            if not setting_mode:
                # skip any more 1s
                if regs[rs1] & bit:
                    i += 1
                    continue

            # setting mode, search for 1
            regs[rd] |= bit # always set during search
            if regs[rs1] & bit: # found a bit in rs1:
                setting_mode = False
                # next loop starts skipping

            i += 1
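
The sif examples can be reproduced with the following sketch. It makes simplifying assumptions: the second operand is treated purely as a bit-enable mask, the positions shown as `x` in the example are returned as zero, and the restart behaviour of the pseudo-code is not modelled. `mask` and `xlen` are illustrative parameter names, with `mask=None` standing in for an absent second operand.

```python
def sif(src, mask=None, xlen=8):
    """Set-including-first sketch; mask=None models no predicate."""
    result = 0
    for i in range(xlen):
        bit = 1 << i
        if mask is not None and not (mask & bit):
            continue        # assumption: masked-off ("x") positions stay zero
        result |= bit       # set every enabled bit up to and including the hit
        if src & bit:
            break           # first enabled set bit found: stop after setting it
    return result
```

The only change relative to a set-before-first loop is that the output bit is set before the test on `src` rather than after it.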

## sof - set only first bit

Similar to sbf and sif except *only* set the bit which ends a run.

Unlike sbf and sif however, if the src is zero then the output is
also guaranteed to be zero, irrespective of src2's contents.

# Example

    7 6 5 4 3 2 1 0   Bit number

    1 0 0 1 0 1 0 0   a3 contents
                      sof a2, a3
    0 0 0 0 0 1 0 0   a2 contents

    1 0 0 1 0 1 0 1   a3 contents
                      sof a2, a3
    0 0 0 0 0 0 0 1   a2

    1 1 0 0 0 0 1 1   a0 contents
    1 1 0 1 0 1 0 0   a3 contents
                      sof a2, a3, a0
    0 1 x x x x 0 0   a2 contents

Pseudo-code:

    def sof(rd, rs1, rs2):
        regs[rd] = 0
        i = 0
        setting_mode = rs2 == x0 or (regs[rs2] & 1)

        while i < XLEN:
            bit = 1 << i

            # only reenable when predicate in use, and bit valid
            if not setting_mode and rs2 != x0:
                if regs[rs2] & bit:
                    # back into "setting" mode
                    setting_mode = True

            # skipping mode
            if not setting_mode:
                # skip any more 1s
                if regs[rs1] & bit:
                    i += 1
                    continue

            # setting mode, search for 1
            if regs[rs1] & bit: # found a bit in rs1:
                regs[rd] |= bit # only set when search succeeds
                setting_mode = False
                # next loop starts skipping

            i += 1
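
A compact sketch that reproduces the sof examples is given below. It assumes the second operand acts purely as a bit-enable mask, returns the `x` positions as zero, and does not model the restart behaviour of the pseudo-code. For the unpredicated case it also shows the well-known two's-complement identity `src & -src`, which isolates the lowest set bit. The names `mask`, `xlen` and `sof_unpredicated` are illustrative.

```python
def sof(src, mask=None, xlen=8):
    """Set-only-first sketch; mask=None models no predicate."""
    for i in range(xlen):
        bit = 1 << i
        if mask is not None and not (mask & bit):
            continue        # assumption: masked-off ("x") positions stay zero
        if src & bit:
            return bit      # only the first enabled set bit is output
    return 0                # no enabled set bit: result is zero

def sof_unpredicated(src):
    """Lowest set bit via two's complement; zero stays zero."""
    return src & -src
```

Because nothing is set until a hit is found, a zero `src` always yields zero regardless of the mask, matching the guarantee stated at the start of this section.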