openpower/sv/rfc/ls013.mdwn

# RFC ls013 Min/Max GPR/FPR

**URLs**:

* <https://libre-soc.org/openpower/sv/rfc/ls013/>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057>

**Severity**: Major

**Status**: New

**Date**: 14 Apr 2023

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**:

```
    Book I Fixed-Point Instructions
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of new GPR-based and FPR-based instructions
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    GPR, FPR, min, max, fmin, fmax
```

**Motivation**

TODO

**Notes and Observations**:

1. Minimum/maximum instructions are needed for vector reductions: the SVP64 tree reduction needs the operation to be a single instruction in order to work properly.
2. Once any one of the FP min/max modes is implemented, the remaining modes require little additional hardware.
3. FP min/max are rather complex to implement in software: the most commonly used FP max function, `fmax` from glibc, is 32 (!) instructions when compiled for SFFS, as the listing below shows.

<https://gcc.godbolt.org/z/6xba61To6>

```
    fmax(double, double):
        fcmpu 0,1,2
        fmr 0,1
        cror 30,1,2
        beq 7,.L12
        blt 0,.L13
        stfd 1,-16(1)
        lis 9,0x8
        li 8,-1
        sldi 9,9,32
        rldicr 8,8,0,11
        ori 2,2,0
        ld 10,-16(1)
        xor 10,10,9
        sldi 10,10,1
        cmpld 0,10,8
        bgt 0,.L5
        stfd 2,-16(1)
        ori 2,2,0
        ld 10,-16(1)
        xor 9,10,9
        sldi 9,9,1
        cmpld 0,9,8
        ble 0,.L6
.L5:
        fadd 1,0,2
        blr
.L13:
        fmr 1,2
        blr
.L6:
        fcmpu 0,2,2
        fmr 1,2
        bnulr 0
.L12:
        fmr 1,0
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0
```
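To illustrate the reduction use case from note 1, the following is a minimal sketch of a generic pairwise (tree) reduction. It is not the SVP64 reduction schedule, and `tree_reduce` and `maxu64` are hypothetical names introduced here. It only shows that every combining step is one two-operand operation, which is the role a single min/max instruction would fill.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

MASK64 = (1 << 64) - 1


def tree_reduce(values: List[T], op: Callable[[T, T], T]) -> T:
    """Pairwise reduction: combine neighbours level by level until one value remains."""
    if not values:
        raise ValueError("cannot reduce an empty sequence")
    level = list(values)
    while len(level) > 1:
        # Each list element below corresponds to one application of `op`.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element is carried to the next level unchanged
        level = nxt
    return level[0]


def maxu64(a: int, b: int) -> int:
    """Hypothetical stand-in for an unsigned 64-bit maximum of two register values."""
    a &= MASK64
    b &= MASK64
    return a if a > b else b


largest = tree_reduce([7, 3, MASK64, 42, 9], maxu64)
```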

**Changes**

Add the following entries to:

* the Appendices of Book I
* Book I 3.3.9 Fixed-Point Arithmetic Instructions
* Book I 4.6.6.1 Floating-Point Elementary Arithmetic Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

## `FMM` -- Floating Min/Max Mode

<a id="fmm-floating-min-max-mode"></a>

| `FMM` | Assembly Alias                | Origin                         | Semantics                                       |
|-------|-------------------------------|--------------------------------|-------------------------------------------------|
| 0000  | fminnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = minNum(FRA, FRB)  (1)                     |
| 0001  | fmin19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = minimum(FRA, FRB)                         |
| 0010  | fminnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minimumNumber(FRA, FRB)                   |
| 0011  | fminc[s] FRT, FRA, FRB        | x86 minss or Win32's min macro | FRT = FRA \< FRB ? FRA : FRB                    |
| 0100  | fminmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, False, fminnum08) (2) |
| 0101  | fminmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fmin19) (2)    |
| 0110  | fminmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fminnum19) (2) |
| 0111  | fminmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, False, fminc) (2)     |
| 1000  | fmaxnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = maxNum(FRA, FRB)  (1)                     |
| 1001  | fmax19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = maximum(FRA, FRB)                         |
| 1010  | fmaxnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = maximumNumber(FRA, FRB)                   |
| 1011  | fmaxc[s] FRT, FRA, FRB        | x86 maxss or Win32's max macro | FRT = FRA > FRB ? FRA : FRB                     |
| 1100  | fmaxmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, True, fmaxnum08) (2)  |
| 1101  | fmaxmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmax19) (2)     |
| 1110  | fmaxmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmaxnum19) (2)  |
| 1111  | fmaxmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, True, fmaxc) (2)      |

Note (1): for the purposes of minNum/maxNum, -0.0 is defined to be less than +0.0. This is left unspecified in IEEE 754-2008.
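The difference between the base minimum modes in the table is easiest to see on NaN and signed-zero inputs. The sketch below is an illustrative value-level model only, assuming Python floats stand in for FPR contents. It ignores signalling NaNs, exception flags and rounding, so at this level `fminnum19` would behave like `fminnum08` and is omitted. The helper `_less` is a name introduced here to apply the -0.0 < +0.0 ordering from Note (1). Under this model `fminnum08(nan, 1.0)` yields 1.0, `fmin19(nan, 1.0)` yields NaN, and `fminc` depends on operand order: `fminc(nan, 1.0)` yields 1.0 while `fminc(1.0, nan)` yields NaN.

```python
import math


def _less(x: float, y: float) -> bool:
    """x < y, except that -0.0 is ordered below +0.0."""
    if x == 0.0 and y == 0.0:
        return math.copysign(1.0, x) < math.copysign(1.0, y)
    return x < y


def fminnum08(a: float, b: float) -> float:
    """minNum-style: a single quiet NaN operand is ignored."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if _less(a, b) else b


def fmin19(a: float, b: float) -> float:
    """minimum-style: any NaN operand produces NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if _less(a, b) else b


def fminc(a: float, b: float) -> float:
    """C-style: FRA < FRB ? FRA : FRB, with no special cases."""
    return a if a < b else b
```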

Note (2): minmaxmag(x, y, is_max, fallback) is defined as:

```python
def minmaxmag(x, y, is_max, fallback):
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a  # swap
    if a:
        return x
    if b:
        return y
    # equal magnitudes, or NaN input(s)
    return fallback(x, y)
```
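As a usage illustration of Note (2), the sketch below pairs `minmaxmag` with the C-style fallbacks to model `fminmagc` and `fmaxmagc` on Python floats. The wrapper functions are assumptions made for this example, not part of the proposal text. With inputs -3.0 and 2.0 the magnitudes differ, so the fallback is never consulted: the minimum-magnitude form returns 2.0 and the maximum-magnitude form returns -3.0. With -2.0 and 2.0 the magnitudes tie and the fallback decides.

```python
def minmaxmag(x, y, is_max, fallback):
    """Definition repeated from Note (2) so the example is self-contained."""
    a = abs(x) < abs(y)
    b = abs(x) > abs(y)
    if is_max:
        a, b = b, a  # swap
    if a:
        return x
    if b:
        return y
    # equal magnitudes, or NaN input(s)
    return fallback(x, y)


def fminc(a, b):
    return a if a < b else b


def fmaxc(a, b):
    return a if a > b else b


def fminmagc(a, b):
    """Model of FMM=0111 under the assumptions stated above."""
    return minmaxmag(a, b, False, fminc)


def fmaxmagc(a, b):
    """Model of FMM=1111 under the assumptions stated above."""
    return minmaxmag(a, b, True, fmaxc)
```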

Note (3): TODO: confirm whether IEEE 754-2008 has min/maxMagNum operations like IEEE 754-2019's minimum/maximumMagnitudeNumber.

----------------

\newpage{}

## Floating Minimum/Maximum X-Form

```
    fminmax FRT, FRA, FRB, FMM
```

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the result in FRT.

Assembly Aliases: see [`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
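The alias names in the `FMM` table follow a regular pattern: the most significant bit selects min or max, the next bit selects the magnitude variant, and the low two bits select the NaN-handling flavour. The sketch below reconstructs the alias from a 4-bit `FMM` value on that basis. The bit decomposition is inferred from the table rather than stated in the text, and `fmm_alias` is a hypothetical helper name. For example, `fmm_alias(0b0101)` gives `fminmag19`.

```python
_FLAVOURS = ("num08", "19", "num19", "c")  # indexed by the low two bits of FMM


def fmm_alias(fmm: int, single: bool = False) -> str:
    """Return the assembly alias for a 4-bit FMM value, e.g. 0b0101 -> 'fminmag19'."""
    if not 0 <= fmm <= 0b1111:
        raise ValueError("FMM is a 4-bit field")
    is_max = bool(fmm & 0b1000)
    is_mag = bool(fmm & 0b0100)
    name = "f" + ("max" if is_max else "min") + ("mag" if is_mag else "")
    name += _FLAVOURS[fmm & 0b0011]
    # The optional [s] suffix in the table marks the single-precision form.
    return name + ("s" if single else "")
```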

----------

## Floating Minimum/Maximum Single X-Form

```
    fminmaxs FRT, FRA, FRB, FMM
```

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Compute the minimum/maximum of FRA and FRB, according to FMM, and store the result in FRT.

Assembly Aliases: see [`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)

----------

\newpage{}

## Minimum Unsigned X-Form

```
    minu RT, RA, RB
    minu. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) <u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Maximum Unsigned X-Form

```
    maxu RT, RA, RB
    maxu. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) >u (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the unsigned maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```
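The pseudocode for `minu` and `maxu` can be modelled on 64-bit register values as follows. This is a sketch under stated assumptions: registers are represented as Python integers masked to 64 bits, and `cr0_bits` assumes the usual Power ISA convention for Rc=1 in 64-bit mode (LT, GT and EQ from a signed comparison of the result with zero, SO copied from XER), which this RFC does not spell out. All function names are introduced here for illustration.

```python
MASK64 = (1 << 64) - 1


def minu(ra: int, rb: int) -> int:
    """Unsigned minimum of two 64-bit register values."""
    ra &= MASK64
    rb &= MASK64
    return ra if ra < rb else rb


def maxu(ra: int, rb: int) -> int:
    """Unsigned maximum of two 64-bit register values."""
    ra &= MASK64
    rb &= MASK64
    return ra if ra > rb else rb


def cr0_bits(result: int, so: bool = False) -> tuple:
    """Assumed Rc=1 behaviour: (LT, GT, EQ, SO) from the result viewed as signed."""
    result &= MASK64
    signed = result - (1 << 64) if result >> 63 else result
    return (signed < 0, signed > 0, signed == 0, so)
```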

----------

\newpage{}

## Minimum X-Form

```
    min RT, RA, RB
    min. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) < (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed minimum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Maximum X-Form

```
    max RT, RA, RB
    max. RT, RA, RB
```

```
    |0   |6   |11  |16   |21  |31  |
    | PO | RT | RA | RB  | XO | Rc |
```

```
    if (RA) > (RB) then
        RT <- (RA)
    else
        RT <- (RB)
```

Compute the signed maximum of RA and RB and store the result in RT.

Special Registers altered:

```
    CR0     (if Rc=1)
```
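The signed forms differ from the unsigned ones only in how the register contents are interpreted. The sketch below models `min` and `max` on 64-bit two's-complement values held as Python integers. The names `smin64`, `smax64` and `to_signed64` are chosen for this example (partly to avoid shadowing Python's built-in `min` and `max`) and do not appear in the proposal. With RA = 0xFFFFFFFFFFFFFFFF (that is, -1) and RB = 1, the signed minimum is RA and the signed maximum is RB, the opposite of the unsigned result.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Interpret a 64-bit register value as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def smin64(ra: int, rb: int) -> int:
    """Signed minimum; the result is returned as a raw 64-bit register value."""
    ra &= MASK64
    rb &= MASK64
    return ra if to_signed64(ra) < to_signed64(rb) else rb


def smax64(ra: int, rb: int) -> int:
    """Signed maximum; the result is returned as a raw 64-bit register value."""
    ra &= MASK64
    rb &= MASK64
    return ra if to_signed64(ra) > to_signed64(rb) else rb
```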

----------

\newpage{}

# Instruction Formats

Add the following entries to Book I 1.6.1.15 X-FORM:

```
    |0    |6    |11   |16   |21        |24  |31      |
    | PO  | FRT | FRA | FRB | FMM[0:2] | XO | FMM[3] |
```

Add a new field to Book I 1.6.2 Word Instruction Fields:

```
    FMM (21:23,31)
        Field used to specify minimum/maximum mode for fminmax[s].

        Formats: X
```
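Because `FMM` is split across bits 21:23 and bit 31, packing and unpacking it takes a little care. The sketch below shows one way to assemble a 32-bit instruction word from the layout above, using Power ISA bit numbering (bit 0 is the most significant bit), so that `FMM[0:2]` holds the upper three bits of the mode and `FMM[3]` the lowest. The PO and XO values are not allocated in this document, so they are passed in as parameters. The function names are hypothetical.

```python
def encode_fminmax(po: int, xo: int, frt: int, fra: int, frb: int, fmm: int) -> int:
    """Pack the X-Form fields into a 32-bit word (bit 0 = most significant bit)."""
    fields = (("po", po, 6), ("xo", xo, 7), ("frt", frt, 5),
              ("fra", fra, 5), ("frb", frb, 5), ("fmm", fmm, 4))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (
        (po << 26)            # bits 0:5
        | (frt << 21)         # bits 6:10
        | (fra << 16)         # bits 11:15
        | (frb << 11)         # bits 16:20
        | ((fmm >> 1) << 8)   # FMM[0:2] in bits 21:23
        | (xo << 1)           # bits 24:30
        | (fmm & 1)           # FMM[3] in bit 31
    )


def decode_fmm(insn: int) -> int:
    """Reassemble the 4-bit FMM value from bits 21:23 and bit 31."""
    return (((insn >> 8) & 0b111) << 1) | (insn & 1)
```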

----------

\newpage{}

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description                     |
|------|------|------|---------|----------|---------------------------------|
| X    | I    | #    | 3.2B    | fminmax  | Floating Minimum/Maximum        |
| X    | I    | #    | 3.2B    | fminmaxs | Floating Minimum/Maximum Single |
| X    | I    | #    | 3.2B    | minu     | Minimum Unsigned                |
| X    | I    | #    | 3.2B    | maxu     | Maximum Unsigned                |
| X    | I    | #    | 3.2B    | min      | Minimum                         |
| X    | I    | #    | 3.2B    | max      | Maximum                         |

[[!tag opf_rfc]]