# RFC ls013 Min/Max GPR/FPR

**URLs**:

* <https://libre-soc.org/openpower/sv/rfc/ls013/>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057>

**Severity**: Major

**Status**: New

**Date**: 14 Apr 2023

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**:

```
    Book I Fixed-Point Instructions
    Book I Floating-Point Instructions
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of new GPR-based and FPR-based instructions
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    GPR, FPR, min, max, fmin, fmax
```

**Motivation**

TODO

**Notes and Observations**:

1. Minimum/maximum instructions are needed for vector reductions: SVP64 tree
   reduction requires the reduction operation to be a single instruction in
   order to work properly.
2. Once any one of the FP min/max modes is implemented, the remaining modes
   require little additional hardware.
3. TODO(lkcl): fill out: using VSX may have a different meaning (SVP64/VSX),
   so it is *really* crucial to have SVP64/SFFS ops.
4. FP min/max are rather complex to implement in software. The most commonly
   used FP max function, `fmax` from glibc, is 32 (!) instructions when
   compiled for SFFS.

<https://gcc.godbolt.org/z/6xba61To6>

```
fmax(double, double):
        fcmpu 0,1,2
        fmr 0,1
        cror 30,1,2
        beq 7,.L12
        blt 0,.L13
        stfd 1,-16(1)
        lis 9,0x8
        li 8,-1
        sldi 9,9,32
        rldicr 8,8,0,11
        ori 2,2,0
        ld 10,-16(1)
        xor 10,10,9
        sldi 10,10,1
        cmpld 0,10,8
        bgt 0,.L5
        stfd 2,-16(1)
        ori 2,2,0
        ld 10,-16(1)
        xor 9,10,9
        sldi 9,9,1
        cmpld 0,9,8
        ble 0,.L6
.L5:
        fadd 1,0,2
        blr
.L13:
        fmr 1,2
        blr
.L6:
        fcmpu 0,2,2
        fmr 1,2
        bnulr 0
.L12:
        fmr 1,0
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0
```
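
To illustrate Note 1, the following is a minimal Python sketch of a pairwise
(tree) reduction in which every step is exactly one application of a binary
operation. The name `tree_reduce` is hypothetical, and the loop order shown
is only an illustration: it is not claimed to match the reduction schedule
that SVP64 actually specifies.

```python
def tree_reduce(values, op):
    """Reduce `values` pairwise using the two-operand function `op`.

    Illustrative only: each inner step stands in for one two-operand
    instruction (such as a single min or max); the element ordering is
    an assumption, not the SVP64-defined schedule.
    """
    vals = list(values)
    if not vals:
        raise ValueError("cannot reduce an empty vector")
    step = 1
    while step < len(vals):
        # combine element i with its partner `step` positions away
        for i in range(0, len(vals) - step, step * 2):
            vals[i] = op(vals[i], vals[i + step])
        step *= 2
    return vals[0]


print(tree_reduce([3, 9, 2, 7, 5], max))
print(tree_reduce([3, 9, 2, 7, 5], min))
```

If the operation needed a multi-instruction sequence such as the `fmax`
listing above, it could not be expressed as the single step inside the inner
loop.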

**Changes**

Add the following entries to:

* the Appendices of Book I
* Book I 3.3.9 Fixed-Point Arithmetic Instructions
* Book I 4.6.6.1 Floating-Point Elementary Arithmetic Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

## `FMM` -- Floating Min/Max Mode

<a id="fmm-floating-min-max-mode"></a>

| `FMM` | Assembly Alias                | Origin                         | Semantics                                       |
|-------|-------------------------------|--------------------------------|-------------------------------------------------|
| 0000  | fminnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = minNum(FRA, FRB) (1)                      |
| 0001  | fmin19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = minimum(FRA, FRB)                         |
| 0010  | fminnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minimumNumber(FRA, FRB)                   |
| 0011  | fminc[s] FRT, FRA, FRB        | x86 minss or Win32's min macro | FRT = FRA \< FRB ? FRA : FRB                    |
| 0100  | fminmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, False, fminnum08) (2) |
| 0101  | fminmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fmin19) (2)    |
| 0110  | fminmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fminnum19) (2) |
| 0111  | fminmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, False, fminc) (2)     |
| 1000  | fmaxnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = maxNum(FRA, FRB) (1)                      |
| 1001  | fmax19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = maximum(FRA, FRB)                         |
| 1010  | fmaxnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = maximumNumber(FRA, FRB)                   |
| 1011  | fmaxc[s] FRT, FRA, FRB        | x86 maxss or Win32's max macro | FRT = FRA > FRB ? FRA : FRB                     |
| 1100  | fmaxmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, True, fmaxnum08) (2)  |
| 1101  | fmaxmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmax19) (2)     |
| 1110  | fmaxmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmaxnum19) (2)  |
| 1111  | fmaxmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, True, fmaxc) (2)      |

Note (1): for the purposes of minNum/maxNum, -0.0 is defined to be less than
+0.0. This is left unspecified in IEEE 754-2008.

Note (2): minmaxmag(x, y, is_max, fallback) is defined as:

```python
def minmaxmag(x, y, is_max, fallback):
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a  # swap
    if a:
        return x
    if b:
        return y
    # equal magnitudes, or NaN input(s)
    return fallback(x, y)
```

Note (3): TODO: check whether IEEE 754-2008 has min/maxMagNum operations
like IEEE 754-2019's minimum/maximumMagnitudeNumber.
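
As an illustration of how the modes differ, the following Python sketch
models three of the minimum-side fallbacks (`fminc`, `fmin19`, `fminnum19`)
and repeats `minmaxmag` so that the block is self-contained. It rests on
several assumptions: Python floats cannot distinguish signalling NaNs, so
sNaN behaviour and exception flags are not modelled (which is also why
`fminnum08` is omitted, since without sNaNs it would look the same as
`fminnum19` here); the helper `_neg` is hypothetical; and -0.0 is treated
as less than +0.0 as in Note (1). It is a sketch of the table's semantics,
not a reference implementation.

```python
import math


def _neg(x):
    # True for negative values and for -0.0 (hypothetical helper)
    return math.copysign(1.0, x) < 0.0


def fminc(x, y):
    # FRA < FRB ? FRA : FRB -- any NaN input makes the compare false
    return x if x < y else y


def fmin19(x, y):
    # IEEE 754-2019 minimum: a NaN input gives a NaN result
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == y:  # covers +0.0 versus -0.0
        return x if _neg(x) else y
    return x if x < y else y


def fminnum19(x, y):
    # IEEE 754-2019 minimumNumber: prefer a number over a NaN
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return fmin19(x, y)


def minmaxmag(x, y, is_max, fallback):
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a  # swap
    if a:
        return x
    if b:
        return y
    # equal magnitudes, or NaN input(s)
    return fallback(x, y)


print(fminc(math.nan, 1.0))                     # 1.0 (second operand wins)
print(fmin19(math.nan, 1.0))                    # nan
print(fminnum19(math.nan, 1.0))                 # 1.0
print(minmaxmag(-3.0, 2.0, False, fmin19))      # 2.0 (smaller magnitude)
print(minmaxmag(-2.0, 2.0, False, fmin19))      # -2.0 (tie, so fallback)
```

The maximum-side modes mirror these with the comparisons reversed.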

----------------

\newpage{}

## Floating Minimum/Maximum X-Form

```
    fminmax FRT, FRA, FRB, FMM
```

```
    |0     |6     |11    |16    |21        |24    |31      |
    | PO   | FRT  | FRA  | FRB  | FMM[0:2] | XO   | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
result in FRT.

Assembly Aliases: see
[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

## Floating Minimum/Maximum Single X-Form

```
    fminmaxs FRT, FRA, FRB, FMM
```

```
    |0     |6     |11    |16    |21        |24    |31      |
    | PO   | FRT  | FRA  | FRB  | FMM[0:2] | XO   | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
result in FRT.

Assembly Aliases: see
[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

\newpage{}

## Minimum Unsigned X-Form

```
    minu  RT, RA, RB
    minu. RT, RA, RB
```

```
    |0     |6     |11    |16    |21    |31   |
    | PO   | RT   | RA   | RB   | XO   | Rc  |
```

```
    if (RA) <u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0 (if Rc=1)
```

----------

## Maximum Unsigned X-Form

```
    maxu  RT, RA, RB
    maxu. RT, RA, RB
```

```
    |0     |6     |11    |16    |21    |31   |
    | PO   | RT   | RA   | RB   | XO   | Rc  |
```

```
    if (RA) >u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0 (if Rc=1)
```

----------

\newpage{}

## Minimum X-Form

```
    min  RT, RA, RB
    min. RT, RA, RB
```

```
    |0     |6     |11    |16    |21    |31   |
    | PO   | RT   | RA   | RB   | XO   | Rc  |
```

```
    if (RA) < (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0 (if Rc=1)
```

----------

## Maximum X-Form

```
    max  RT, RA, RB
    max. RT, RA, RB
```

```
    |0     |6     |11    |16    |21    |31   |
    | PO   | RT   | RA   | RB   | XO   | Rc  |
```

```
    if (RA) > (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0 (if Rc=1)
```
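
The difference between the signed and unsigned forms can be seen in a small
Python model of the four pseudo-code fragments. Several things here are
assumptions made for illustration only: registers are modelled as 64-bit
unsigned integers, the function names `min_signed`/`max_signed` are chosen to
avoid shadowing Python's built-ins, and `cr0` follows the usual Power ISA
Rc=1 convention of comparing the result with zero as a signed value, with
XER.SO supplied by the caller.

```python
MASK64 = (1 << 64) - 1


def to_signed(v):
    # reinterpret a 64-bit register value as two's complement
    v &= MASK64
    return v - (1 << 64) if v >> 63 else v


def minu(ra, rb):
    ra, rb = ra & MASK64, rb & MASK64
    return ra if ra < rb else rb  # (RA) <u (RB)


def maxu(ra, rb):
    ra, rb = ra & MASK64, rb & MASK64
    return ra if ra > rb else rb  # (RA) >u (RB)


def min_signed(ra, rb):
    ra, rb = ra & MASK64, rb & MASK64
    return ra if to_signed(ra) < to_signed(rb) else rb


def max_signed(ra, rb):
    ra, rb = ra & MASK64, rb & MASK64
    return ra if to_signed(ra) > to_signed(rb) else rb


def cr0(rt, so=False):
    # (LT, GT, EQ, SO) as set when Rc=1; SO is assumed copied from XER
    s = to_signed(rt)
    return (s < 0, s > 0, s == 0, bool(so))


minus_one = MASK64  # 0xFFFF_FFFF_FFFF_FFFF, i.e. -1 when signed
print(hex(minu(minus_one, 1)))        # 0x1
print(hex(min_signed(minus_one, 1)))  # 0xffffffffffffffff
print(cr0(min_signed(minus_one, 1)))  # (True, False, False, False)
```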

----------

\newpage{}

# Instruction Formats

Add the following entries to Book I 1.6.1.15 X-FORM:

```
    |0     |6     |11    |16    |21        |24    |31      |
    | PO   | FRT  | FRA  | FRB  | FMM[0:2] | XO   | FMM[3] |
```

Add a new field to Book I 1.6.2 Word Instruction Fields:

```
    FMM (21:23,31)
        Field used to specify minimum/maximum mode for fminmax[s].

        Formats: X
```
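
Because `FMM` is split across bits 21:23 and bit 31, a short Python sketch
may help show how the 4-bit mode maps onto a 32-bit instruction word. It
assumes the Power ISA MSB0 numbering (bit 0 is the most significant bit, so
bit *n* sits at shift `31 - n`). The helper names `insert_fmm` and
`extract_fmm` are hypothetical, and since this RFC does not assign PO or XO
values, only the `FMM` bits are touched.

```python
FMM_HI_SHIFT = 31 - 23   # FMM[0:2] occupies MSB0 bits 21:23
FMM_HI_MASK = 0b111 << FMM_HI_SHIFT
FMM_LO_MASK = 0b1        # FMM[3] occupies MSB0 bit 31


def insert_fmm(word, fmm):
    """Return `word` with its FMM field replaced by the 4-bit `fmm`."""
    if not 0 <= fmm <= 0b1111:
        raise ValueError("FMM is a 4-bit field")
    word &= ~(FMM_HI_MASK | FMM_LO_MASK) & 0xFFFFFFFF
    word |= (fmm >> 1) << FMM_HI_SHIFT   # upper three bits, FMM[0:2]
    word |= fmm & 1                      # lowest bit, FMM[3]
    return word


def extract_fmm(word):
    """Recover the 4-bit FMM value from an instruction word."""
    hi = (word & FMM_HI_MASK) >> FMM_HI_SHIFT
    return (hi << 1) | (word & FMM_LO_MASK)


# FMM = 1001 selects the fmax19 alias in the FMM table
print(hex(insert_fmm(0, 0b1001)))   # 0x401
print(extract_fmm(0x401))           # 9
```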

----------

\newpage{}

# Appendices

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description                     |
|------|------|------|---------|----------|---------------------------------|
| X    | I    | #    | 3.2B    | fminmax  | Floating Minimum/Maximum        |
| X    | I    | #    | 3.2B    | fminmaxs | Floating Minimum/Maximum Single |
| X    | I    | #    | 3.2B    | minu     | Minimum Unsigned                |
| X    | I    | #    | 3.2B    | maxu     | Maximum Unsigned                |
| X    | I    | #    | 3.2B    | min      | Minimum                         |
| X    | I    | #    | 3.2B    | max      | Maximum                         |

[[!tag opf_rfc]]
