# RFC ls013 Min/Max GPR/FPR

**URLs**:

* <https://libre-soc.org/openpower/sv/rfc/ls013/>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057>

**Severity**: Major

**Status**: New

**Date**: 14 Apr 2023

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**:

```
    Book I Fixed-Point Instructions
    Book I Floating-Point Instructions
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of new GPR-based and FPR-based instructions
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    GPR, FPR, min, max, fmin, fmax
```

**Motivation**

TODO

**Notes and Observations**:

1. Minimum/maximum instructions are needed for vector reductions, where
    SVP64 tree reduction requires the operation to be a single instruction
    in order to work properly.
2. Once any one of the FP min/max modes is implemented, the remaining modes
    require little additional hardware.
3. TODO(lkcl): fill out: using VSX may have a different meaning (SVP64/VSX),
    so it is *really* crucial to have SVP64/SFFS ops.
4. FP min/max are rather complex to implement in software: the most commonly
    used FP max function, `fmax` from glibc, is 32 (!) instructions when
    compiled for SFFS.

<https://gcc.godbolt.org/z/6xba61To6>

```
    fmax(double, double):
        fcmpu 0,1,2
        fmr 0,1
        cror 30,1,2
        beq 7,.L12
        blt 0,.L13
        stfd 1,-16(1)
        lis 9,0x8
        li 8,-1
        sldi 9,9,32
        rldicr 8,8,0,11
        ori 2,2,0
        ld 10,-16(1)
        xor 10,10,9
        sldi 10,10,1
        cmpld 0,10,8
        bgt 0,.L5
        stfd 2,-16(1)
        ori 2,2,0
        ld 10,-16(1)
        xor 9,10,9
        sldi 9,9,1
        cmpld 0,9,8
        ble 0,.L6
.L5:
        fadd 1,0,2
        blr
.L13:
        fmr 1,2
        blr
.L6:
        fcmpu 0,2,2
        fmr 1,2
        bnulr 0
.L12:
        fmr 1,0
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0
```
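
Reading the listing above, the sequence appears to perform an ordered comparison, then a bit-level signaling-NaN test on each operand, then a quiet-NaN selection. The Python sketch below is an interpretation of the assembly, not glibc's source: the names `is_signaling_bits`, `issignaling` and `fmax_sketch` are illustrative, and Python floats do not model the Invalid Operation exception that `fadd` would raise.

```python
import math
import struct

QUIET_BIT = 0x0008_0000_0000_0000   # built by `lis 9,0x8` + `sldi 9,9,32`
TOP12_MASK = 0xFFF0_0000_0000_0000  # built by `li 8,-1` + `rldicr 8,8,0,11`
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def is_signaling_bits(bits):
    """Integer test mirroring the xor / sldi / cmpld steps of the listing."""
    return (((bits ^ QUIET_BIT) << 1) & MASK64) > TOP12_MASK


def issignaling(x):
    """Apply the integer test to the raw bits of a Python float."""
    return is_signaling_bits(struct.unpack("<Q", struct.pack("<d", x))[0])


def fmax_sketch(x, y):
    if x >= y:                # fcmpu + cror + beq 7,.L12
        return x
    if x < y:                 # blt 0,.L13
        return y
    # Unordered: at least one operand is a NaN.
    if issignaling(x) or issignaling(y):
        return x + y          # .L5: the addition yields a quiet NaN
    return x if math.isnan(y) else y  # .L6 / .L12
```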

**Changes**

Add the following entries to:

* the Appendices of Book I
* Book I 3.3.9 Fixed-Point Arithmetic Instructions
* Book I 4.6.6.1 Floating-Point Elementary Arithmetic Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

## `FMM` -- Floating Min/Max Mode

<a id="fmm-floating-min-max-mode"></a>

| `FMM` | Assembly Alias                | Origin                         | Semantics                                       |
|-------|-------------------------------|--------------------------------|-------------------------------------------------|
| 0000  | fminnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = minNum(FRA, FRB)  (1)                     |
| 0001  | fmin19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = minimum(FRA, FRB)                         |
| 0010  | fminnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minimumNumber(FRA, FRB)                   |
| 0011  | fminc[s] FRT, FRA, FRB        | x86 minss or Win32's min macro | FRT = FRA \< FRB ? FRA : FRB                    |
| 0100  | fminmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, False, fminnum08) (2) |
| 0101  | fminmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fmin19) (2)    |
| 0110  | fminmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fminnum19) (2) |
| 0111  | fminmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, False, fminc) (2)     |
| 1000  | fmaxnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = maxNum(FRA, FRB)  (1)                     |
| 1001  | fmax19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = maximum(FRA, FRB)                         |
| 1010  | fmaxnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = maximumNumber(FRA, FRB)                   |
| 1011  | fmaxc[s] FRT, FRA, FRB        | x86 maxss or Win32's max macro | FRT = FRA > FRB ? FRA : FRB                     |
| 1100  | fmaxmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, True, fmaxnum08) (2)  |
| 1101  | fmaxmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmax19) (2)     |
| 1110  | fmaxmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmaxnum19) (2)  |
| 1111  | fmaxmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, True, fmaxc) (2)      |

Note (1): For the purposes of minNum/maxNum, -0.0 is defined to be less than
+0.0. This is left unspecified in IEEE 754-2008.

Note (2): `minmaxmag(x, y, is_max, fallback)` is defined as:

```python
def minmaxmag(x, y, is_max, fallback):
    # for "min", prefer the operand with the smaller magnitude
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a  # swap: for "max", prefer the larger magnitude
    if a:
        return x
    if b:
        return y
    # equal magnitudes, or NaN input(s): defer to the non-magnitude mode
    return fallback(x, y)
```
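
Notes (1) and (2) can be combined into a small executable sketch. It assumes double precision, treats every NaN as quiet, and ignores exception flags. The function names follow the assembly aliases in the table, while the helper `less` is purely illustrative.

```python
import math


def minmaxmag(x, y, is_max, fallback):
    # Repeated from Note (2) so that this sketch runs on its own.
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a
    if a:
        return x
    if b:
        return y
    return fallback(x, y)


def less(x, y):
    """x < y for non-NaN inputs, with -0.0 ordered below +0.0 (Note (1))."""
    if x == 0.0 and y == 0.0:
        return math.copysign(1.0, x) < math.copysign(1.0, y)
    return x < y


def fmin19(x, y):
    """NaN-propagating minimum (FMM=0001)."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return x if less(x, y) else y


def fminnum19(x, y):
    """NaN-ignoring minimum (FMM=0010): a lone NaN operand is dropped."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if less(x, y) else y


def fminmag19(x, y):
    """FMM=0101: minimum magnitude, deferring to fmin19 on ties or NaN."""
    return minmaxmag(x, y, False, fmin19)
```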

Note (3): TODO: confirm whether IEEE 754-2008 defines magnitude variants of
minNum/maxNum, analogous to IEEE 754-2019's
minimumMagnitudeNumber/maximumMagnitudeNumber.

----------------

\newpage{}

## Floating Minimum/Maximum X-Form

```
    fminmax FRT, FRA, FRB, FMM
```

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
result in FRT.

Assembly Aliases: see
[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

## Floating Minimum/Maximum Single X-Form

```
    fminmaxs FRT, FRA, FRB, FMM
```

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
result in FRT.

Assembly Aliases: see
[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

\newpage{}

## Minimum Unsigned X-Form

```
    minu RT, RA, RB
    minu. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) <u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Maximum Unsigned X-Form

```
    maxu RT, RA, RB
    maxu. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) >u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```
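
The `minu` and `maxu` pseudocode can be modelled in Python as follows. This is a minimal sketch that assumes 64-bit registers and models only the value written to RT. `MASK64` and the Python function signatures are illustrative and not part of the RFC. Masking first means that an operand passed as `-1` is compared as the all-ones bit pattern, which is the largest unsigned value.

```python
MASK64 = (1 << 64) - 1  # assumed register width: 64 bits


def minu(ra, rb):
    """RT <- (RA) if (RA) <u (RB) else (RB)."""
    ra &= MASK64
    rb &= MASK64
    return ra if ra < rb else rb


def maxu(ra, rb):
    """RT <- (RA) if (RA) >u (RB) else (RB)."""
    ra &= MASK64
    rb &= MASK64
    return ra if ra > rb else rb
```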

----------

\newpage{}

## Minimum X-Form

```
    min RT, RA, RB
    min. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) < (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Maximum X-Form

```
    max RT, RA, RB
    max. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) > (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```
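
The signed forms differ from the unsigned ones only in how the operands are compared. The sketch below assumes 64-bit two's-complement registers. For `Rc=1` it also assumes the usual fixed-point convention, where CR0 holds a signed comparison of the result with zero plus a copy of XER.SO, since the RFC itself states only that CR0 is altered. The names `to_signed`, `smin`, `smax` and `cr0` are illustrative.

```python
MASK64 = (1 << 64) - 1  # assumed register width: 64 bits


def to_signed(v):
    """Interpret a 64-bit pattern as a two's-complement integer."""
    v &= MASK64
    return v - (1 << 64) if v >> 63 else v


def smin(ra, rb):
    """RT <- (RA) if (RA) < (RB) else (RB), signed comparison."""
    return (ra if to_signed(ra) < to_signed(rb) else rb) & MASK64


def smax(ra, rb):
    """RT <- (RA) if (RA) > (RB) else (RB), signed comparison."""
    return (ra if to_signed(ra) > to_signed(rb) else rb) & MASK64


def cr0(result, so=False):
    """Assumed Rc=1 behaviour: (LT, GT, EQ, SO) from the result vs zero."""
    s = to_signed(result)
    return (s < 0, s > 0, s == 0, bool(so))
```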

----------

\newpage{}

# Instruction Formats

Add the following entries to Book I 1.6.1.15 X-FORM:

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Add a new field to Book I 1.6.2 Word Instruction Fields:

```
    FMM (21:23,31)
        Field used to specify minimum/maximum mode for fminmax[s].

        Formats: X
```
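
Because `FMM` is split across bits 21:23 and bit 31, a decoder has to reassemble it. The sketch below uses the Power ISA's MSB0 numbering, in which bit 31 is the least significant bit of the 32-bit word. It handles only the `FMM` field, since this RFC assigns no PO or XO values. The function names are hypothetical, and the alias-naming rule in `fmm_alias` is simply read off the rows of the `FMM` table.

```python
def encode_fmm(fmm):
    """Place a 4-bit FMM value into instruction-word bits 21:23 and 31."""
    hi = (fmm >> 1) & 0b111        # FMM[0:2]
    lo = fmm & 1                   # FMM[3]
    return (hi << (31 - 23)) | lo  # MSB0 bit 23 sits 8 places above bit 31


def decode_fmm(insn):
    """Reassemble FMM from a 32-bit instruction word."""
    hi = (insn >> (31 - 23)) & 0b111
    lo = insn & 1
    return (hi << 1) | lo


def fmm_alias(fmm, single=False):
    """Build the assembly alias mnemonic for a 4-bit FMM value."""
    op = "max" if fmm & 0b1000 else "min"    # FMM[0]: min or max
    mag = "mag" if fmm & 0b0100 else ""      # FMM[1]: compare magnitudes
    flavour = ("num08", "19", "num19", "c")[fmm & 0b11]  # FMM[2:3]
    return "f" + op + mag + flavour + ("s" if single else "")
```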

----------

\newpage{}

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | Mnemonic | Description                     |
|------|------|------|---------|----------|---------------------------------|
| X    | I    | #    | 3.2B    | fminmax  | Floating Minimum/Maximum        |
| X    | I    | #    | 3.2B    | fminmaxs | Floating Minimum/Maximum Single |
| X    | I    | #    | 3.2B    | minu     | Minimum Unsigned                |
| X    | I    | #    | 3.2B    | maxu     | Maximum Unsigned                |
| X    | I    | #    | 3.2B    | min      | Minimum                         |
| X    | I    | #    | 3.2B    | max      | Maximum                         |

[[!tag opf_rfc]]