# RFC ls013 Min/Max GPR/FPR

**URLs**:

* <https://libre-soc.org/openpower/sv/rfc/ls013/>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057>

**Severity**: Major

**Status**: New

**Date**: 14 Apr 2023

**Target**: v3.2B

**Source**: v3.1B

**Books and Sections affected**:

```
    Book I Fixed-Point Instructions
    Book I Floating-Point Instructions
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of new GPR-based and FPR-based instructions
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    GPR, FPR, min, max, fmin, fmax
```

**Motivation**

TODO

**Notes and Observations**:

1. Minimum/maximum instructions are needed for vector reductions, where the
    SVP64 tree reduction needs a single instruction to work properly.
2. Once any one of the FP min/max modes is implemented, the remaining modes
    need little additional hardware.
3. FP min/max are rather complex to implement in software: the most commonly
    used FP max function, `fmax` from glibc, is 32 (!) instructions when
    compiled for SFFS, as the listing below shows.

<https://gcc.godbolt.org/z/6xba61To6>

```
    fmax(double, double):
        fcmpu 0,1,2
        fmr 0,1
        cror 30,1,2
        beq 7,.L12
        blt 0,.L13
        stfd 1,-16(1)
        lis 9,0x8
        li 8,-1
        sldi 9,9,32
        rldicr 8,8,0,11
        ori 2,2,0
        ld 10,-16(1)
        xor 10,10,9
        sldi 10,10,1
        cmpld 0,10,8
        bgt 0,.L5
        stfd 2,-16(1)
        ori 2,2,0
        ld 10,-16(1)
        xor 9,10,9
        sldi 9,9,1
        cmpld 0,9,8
        ble 0,.L6
.L5:
        fadd 1,0,2
        blr
.L13:
        fmr 1,2
        blr
.L6:
        fcmpu 0,2,2
        fmr 1,2
        bnulr 0
.L12:
        fmr 1,0
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0
```

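Note 1 can be illustrated with a generic pairwise tree reduction. The sketch
below is not the SVP64 reduction schedule: `tree_reduce` is a hypothetical
helper, used only to show that each combining step is one two-operand
operation, which is the role a single min or max instruction would fill.

```python
def tree_reduce(values, op):
    """Pairwise (tree) reduction: combine neighbours until one value remains."""
    if not values:
        raise ValueError("cannot reduce an empty sequence")
    level = list(values)
    while len(level) > 1:
        # one "instruction" per pair at this level of the tree
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through unchanged
        level = nxt
    return level[0]


largest = tree_reduce([3, 9, 4, 7, 1], max)
smallest = tree_reduce([3, 9, 4, 7, 1], min)
```

If the combining step needed a compare-and-branch sequence instead of one
operation, every node of the tree would expand into several instructions.
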
**Changes**

Add the following entries to:

* the Appendices of Book I
* Book I 3.3.9 Fixed-Point Arithmetic Instructions
* Book I 4.6.6.1 Floating-Point Elementary Arithmetic Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

## `FMM` -- Floating Min/Max Mode

<a id="fmm-floating-min-max-mode"></a>

| `FMM` | Assembly Alias                | Origin                         | Semantics                                       |
|-------|-------------------------------|--------------------------------|-------------------------------------------------|
| 0000  | fminnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = minNum(FRA, FRB)  (1)                     |
| 0001  | fmin19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = minimum(FRA, FRB)                         |
| 0010  | fminnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minimumNumber(FRA, FRB)                   |
| 0011  | fminc[s] FRT, FRA, FRB        | x86 minss or Win32's min macro | FRT = FRA \< FRB ? FRA : FRB                    |
| 0100  | fminmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, False, fminnum08) (2) |
| 0101  | fminmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fmin19) (2)    |
| 0110  | fminmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fminnum19) (2) |
| 0111  | fminmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, False, fminc) (2)     |
| 1000  | fmaxnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = maxNum(FRA, FRB)  (1)                     |
| 1001  | fmax19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = maximum(FRA, FRB)                         |
| 1010  | fmaxnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = maximumNumber(FRA, FRB)                   |
| 1011  | fmaxc[s] FRT, FRA, FRB        | x86 maxss or Win32's max macro | FRT = FRA > FRB ? FRA : FRB                     |
| 1100  | fmaxmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, True, fmaxnum08) (2)  |
| 1101  | fmaxmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmax19) (2)     |
| 1110  | fmaxmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmaxnum19) (2)  |
| 1111  | fmaxmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, True, fmaxc) (2)      |

Note (1): for the purposes of minNum/maxNum, -0.0 is defined to be less than
    +0.0. This is left unspecified in IEEE 754-2008.

Note (2): minmaxmag(x, y, is_max, fallback) is defined as:

```python
def minmaxmag(x, y, is_max, fallback):
    # compare magnitudes; both comparisons are False if either input is NaN
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a  # swap
    if a:
        return x
    if b:
        return y
    # equal magnitudes, or NaN input(s)
    return fallback(x, y)
```

Note (3): TODO: check whether IEEE 754-2008 has min/maxMagNum operations
    like IEEE 754-2019's minimum/maximumMagnitudeNumber

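To make the differences between the modes concrete, here is a minimal Python
sketch of three of the minimum fallbacks and their use with `minmaxmag`. It is
an illustration under stated assumptions, not a reference model: Python floats
cannot express signaling NaNs, so the sNaN handling that separates
`fminnum08` from `fminnum19` is not shown, and the helper `_lt` is a
hypothetical name that applies the -0.0 < +0.0 ordering of Note (1).

```python
import math


def _lt(x, y):
    """x < y, with -0.0 ordered below +0.0 (hypothetical helper)."""
    if x == 0.0 and y == 0.0:
        return math.copysign(1.0, x) < math.copysign(1.0, y)
    return x < y


def fmin19(x, y):
    """IEEE 754-2019 minimum: the result is NaN if either input is NaN."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return x if _lt(x, y) else y


def fminnum19(x, y):
    """IEEE 754-2019 minimumNumber: a single NaN input is ignored."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if _lt(x, y) else y


def fminc(x, y):
    """FRA < FRB ? FRA : FRB, so FRB is returned on NaN or equality."""
    return x if x < y else y


def minmaxmag(x, y, is_max, fallback):
    """Copy of the Note (2) definition, repeated to keep the sketch runnable."""
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a
    if a:
        return x
    if b:
        return y
    return fallback(x, y)


# fminmag19: smaller magnitude wins, ties and NaNs go to fmin19
smaller_mag = minmaxmag(-3.0, 2.0, False, fmin19)
tie_break = minmaxmag(-2.0, 2.0, False, fmin19)
```

Note that `fminc` is not commutative when a NaN is present: the result depends
on which operand holds the NaN, which is exactly the behaviour the table's
`FRA < FRB ? FRA : FRB` expression describes.
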
----------------

\newpage{}

## Floating Minimum/Maximum X-Form

```
    fminmax FRT, FRA, FRB, FMM
```

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
result in FRT.

Assembly Aliases: see
[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

## Floating Minimum/Maximum Single X-Form

```
    fminmaxs FRT, FRA, FRB, FMM
```

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
result in FRT.

Assembly Aliases: see
[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

\newpage{}

## Minimum Unsigned X-Form

```
    minu RT, RA, RB
    minu. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) <u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Maximum Unsigned X-Form

```
    maxu RT, RA, RB
    maxu. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) >u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

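The `<u` and `>u` comparisons in the `minu` and `maxu` pseudocode can be
modelled in Python as follows. This is a sketch that assumes 64-bit GPRs; the
function names simply mirror the mnemonics, and `MASK64` is a helper constant
introduced here.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers


def minu(ra, rb):
    """Unsigned minimum: register contents compared as non-negative integers."""
    ra &= MASK64
    rb &= MASK64
    return ra if ra < rb else rb


def maxu(ra, rb):
    """Unsigned maximum: register contents compared as non-negative integers."""
    ra &= MASK64
    rb &= MASK64
    return ra if ra > rb else rb


# An all-ones register is the largest unsigned value.
all_ones = 0xFFFF_FFFF_FFFF_FFFF
lo = minu(all_ones, 1)
hi = maxu(all_ones, 1)
```
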
----------

\newpage{}

## Minimum X-Form

```
    min RT, RA, RB
    min. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) < (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Maximum X-Form

```
    max RT, RA, RB
    max. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) > (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

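For the signed forms, the same register bits are interpreted as two's
complement values before comparing. The sketch below assumes 64-bit GPRs. The
`cr0` helper rests on a further assumption not spelled out in this RFC: that
`Rc=1` follows the usual Power ISA convention of comparing the result with
zero as a signed value and copying SO from the XER. The names `to_signed`,
`min_s`, `max_s` and `cr0` are illustrative only.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers


def to_signed(v):
    """Interpret a 64-bit register value as a two's complement integer."""
    v &= MASK64
    return v - (1 << 64) if v >> 63 else v


def min_s(ra, rb):
    """Signed minimum, returned as raw 64-bit register contents."""
    return (ra if to_signed(ra) < to_signed(rb) else rb) & MASK64


def max_s(ra, rb):
    """Signed maximum, returned as raw 64-bit register contents."""
    return (ra if to_signed(ra) > to_signed(rb) else rb) & MASK64


def cr0(result, so=0):
    """(LT, GT, EQ, SO) for Rc=1, assuming the standard CR0 convention."""
    s = to_signed(result)
    return (int(s < 0), int(s > 0), int(s == 0), so)


# All-ones is -1 when signed, so it is the minimum here, not the maximum.
all_ones = 0xFFFF_FFFF_FFFF_FFFF
signed_min = min_s(all_ones, 1)
signed_max = max_s(all_ones, 1)
flags = cr0(signed_min)
```

With the same inputs, the unsigned forms would pick the opposite operand,
which is why both signed and unsigned instructions are proposed.
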
----------

\newpage{}

# Instruction Formats

Add the following entries to Book I 1.6.1.15 X-FORM:

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Add a new field to Book I 1.6.2 Word Instruction Fields:

```
    FMM (21:23,31)
        Field used to specify minimum/maximum mode for fminmax[s].

        Formats: X
```

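Because `FMM` is split across bits 21:23 and bit 31, an assembler or
disassembler has to scatter and gather it. The following sketch shows one way
to do that, using the MSB0 bit numbering of the diagram above (bit 0 is the
most significant bit of the 32-bit word). The PO and XO values are not
assigned in this RFC, so they are left as caller-supplied parameters, and the
function names are hypothetical.

```python
def encode_fminmax(po, frt, fra, frb, fmm, xo):
    """Pack the X-Form fields into a 32-bit word; PO/XO are placeholders."""
    for name, value, width in (("po", po, 6), ("frt", frt, 5), ("fra", fra, 5),
                               ("frb", frb, 5), ("fmm", fmm, 4), ("xo", xo, 7)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    fmm_hi = fmm >> 1  # FMM[0:2] -> instruction bits 21:23
    fmm_lo = fmm & 1   # FMM[3]   -> instruction bit 31
    return ((po << 26) | (frt << 21) | (fra << 16) | (frb << 11)
            | (fmm_hi << 8) | (xo << 1) | fmm_lo)


def decode_fmm(word):
    """Gather the 4-bit FMM value back from bits 21:23 and bit 31."""
    return (((word >> 8) & 0b111) << 1) | (word & 1)


# FMM = 0b1001 selects fmax19 in the mode table; PO and XO are dummy zeros.
word = encode_fminmax(po=0, frt=1, fra=2, frb=3, fmm=0b1001, xo=0)
mode = decode_fmm(word)
```
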
----------

\newpage{}

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description                     |
|------|------|------|---------|----------|---------------------------------|
| X    | I    | #    | 3.2B    | fminmax  | Floating Minimum/Maximum        |
| X    | I    | #    | 3.2B    | fminmaxs | Floating Minimum/Maximum Single |
| X    | I    | #    | 3.2B    | minu     | Minimum Unsigned                |
| X    | I    | #    | 3.2B    | maxu     | Maximum Unsigned                |
| X    | I    | #    | 3.2B    | min      | Minimum                         |
| X    | I    | #    | 3.2B    | max      | Maximum                         |

[[!tag opf_rfc]]
