   1 # RFC ls006 FPR <-> GPR Move/Conversion
   2
   3 **URLs**:
   4
   5 * <https://libre-soc.org/openpower/sv/int_fp_mv/>
   6 * <https://libre-soc.org/openpower/sv/rfc/ls006/>
   7 * <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
   8 * <https://git.openpower.foundation/isa/PowerISA/issues/todo>
   9
  10 **Severity**: Major
  11
  12 **Status**: New
  13
  14 **Date**: 20 Oct 2022
  15
  16 **Target**: v3.2B
  17
  18 **Source**: v3.1B
  19
  20 **Books and Section affected**: **UPDATE**
  21
  22 * Book I 4.6.5 Floating-Point Move Instructions
  23 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
  24 * Appendix E Power ISA sorted by opcode
  25 * Appendix F Power ISA sorted by version
  26 * Appendix G Power ISA sorted by Compliancy Subset
  27 * Appendix H Power ISA sorted by mnemonic
  28
  29 **Summary**
  30
  31 Single-precision Instructions added:
  32
  33 * `fmvtgs` -- Single-Precision Floating Move To GPR
  34 * `fmvfgs` -- Single-Precision Floating Move From GPR
* `fcvtstg` -- Single-Precision Floating Convert To Integer In GPR
  36 * `fcvtfgs` -- Single-Precision Floating Convert From Integer In GPR
  37
  38 Identical (except Double-precision) Instructions added:
  39
  40 * `fmvtg` -- Double-Precision Floating Move To GPR
  41 * `fmvfg` -- Double-Precision Floating Move From GPR
  42 * `fcvttg` -- Double-Precision Floating Convert To Integer In GPR
  43 * `fcvtfg` -- Double-Precision Floating Convert From Integer In GPR
  44
  45 **Submitter**: Luke Leighton (Libre-SOC)
  46
  47 **Requester**: Libre-SOC
  48
  49 **Impact on processor**:
  50
  51 * Addition of four new Single-Precision GPR-FPR-based instructions
  52 * Addition of four new Double-Precision GPR-FPR-based instructions
  53
  54 **Impact on software**:
  55
  56 * Requires support for new instructions in assembler, debuggers,
  57   and related tools.
  58
  59 **Keywords**:
  60
  61 ```
  62     GPR, FPR, Move, Conversion, JavaScript
  63 ```
  64
  65 **Motivation**
  66
CPUs without VSX/VMX lack a way to efficiently transfer data between
FPRs and GPRs: they need to go through memory. This proposal adds more
efficient data transfer (both bitwise copy and Integer <-> FP conversion)
instructions that transfer directly between FPRs and GPRs without needing
to go through memory.
  72
  73 IEEE 754 doesn't specify what results are obtained when converting a NaN
  74 or out-of-range floating-point value to integer, so different programming
  75 languages and ISAs have made different choices.  Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.
  78
  79 **Notes and Observations**:
  80
  81 * These instructions are present in many other ISAs.
  82 * JavaScript rounding as one instruction saves 35 instructions including
  83   six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
  84 * Both sets are orthogonal (no difference except being Single/Double).
  85   This allows IBM to follow the pre-existing precedent of allocating
  86   separate Major Opcodes (PO) for Double-precision and Single-precision
  87   respectively.
  88
  89 **Changes**
  90
  91 Add the following entries to:
  92
  93 * Book I 4.6.5 Floating-Point Move Instructions
  94 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
  95 * Book I 1.6.1 and 1.6.2
  96
  97 ----------------
  98
  99 \newpage{}
 100
 101 # Immediate Tables
 102
 103 Tables that are used by
 104 `fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:
 105
 106 ## `IT` -- Integer Type
 107
 108 | `IT` | Integer Type    | Assembly Alias Mnemonic |
 109 |------|-----------------|-------------------------|
 110 | 0    | Signed 32-bit   | `<op>w`                 |
 111 | 1    | Unsigned 32-bit | `<op>uw`                |
 112 | 2    | Signed 64-bit   | `<op>d`                 |
 113 | 3    | Unsigned 64-bit | `<op>ud`                |
 114
 115 ## `CVM` -- Float to Integer Conversion Mode
 116
 117 | `CVM` | `rounding_mode` | Semantics                        |
 118 |-------|-----------------|----------------------------------|
 119 | 000   | from `FPSCR`    | [OpenPower semantics]            |
 120 | 001   | Truncate        | [OpenPower semantics]            |
 121 | 010   | from `FPSCR`    | [Java/Saturating semantics]      |
 122 | 011   | Truncate        | [Java/Saturating semantics]      |
 123 | 100   | from `FPSCR`    | [JavaScript semantics]           |
 124 | 101   | Truncate        | [JavaScript semantics]           |
 125 | rest  | --              | illegal instruction trap for now |
 126
 127 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
 128 [Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
 129 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
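The `CVM` table can be read as two independent parts: the low bit selects
truncation, and the upper two bits select the conversion semantics. A minimal
Python sketch of that decoding follows. The function name `decode_cvm` and the
returned strings are hypothetical, chosen only for illustration.

```python
def decode_cvm(cvm: int) -> tuple:
    """Split a 3-bit CVM value into (rounding source, semantics)."""
    if not 0 <= cvm <= 7:
        raise ValueError("CVM is a 3-bit field")
    names = {0: "OpenPower", 1: "Java/Saturating", 2: "JavaScript"}
    semantics = names.get(cvm >> 1)
    if semantics is None:
        # CVM = 0b110 and 0b111: illegal instruction trap for now
        raise ValueError("reserved CVM value")
    rounding = "Truncate" if cvm & 1 else "from FPSCR"
    return rounding, semantics
```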
 130
 131 ----------
 132
 133 \newpage{}
 134
## Double-Precision Floating Move To GPR
 136
 137 ```
 138     fmvtg RT, FRB
 139     fmvtg. RT, FRB
 140 ```
 141
 142 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 143 |-----|------|-------|-------|-------|----|--------|
 144 | PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |
 145
 146 ```
 147     RT <- (FRB)
 148 ```
 149
Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
 151 representation directly. This is equivalent to `stfd` followed by `ld`.
 152 As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.
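
The bit-copy behaviour can be modelled in software with Python's `struct`
module. This is only a sketch of the data movement: the Python functions
`fmvtg` and `fmvfg` below are illustrative models, not part of the proposal,
and they say nothing about `Rc=1` handling.

```python
import struct

def fmvtg(frb: float) -> int:
    """Return the 64-bit IEEE 754 bit pattern of a double as an unsigned int."""
    # pack as big-endian binary64 (like stfd), then reload the same
    # 8 bytes as a 64-bit unsigned integer (like ld)
    return struct.unpack(">Q", struct.pack(">d", frb))[0]

def fmvfg(rb: int) -> float:
    """Inverse direction: reinterpret a 64-bit pattern as a double."""
    # store the integer (like std), then reload it as binary64 (like lfd)
    return struct.unpack(">d", struct.pack(">Q", rb))[0]
```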
 153
 154 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 155 operations.
 156
 157 Special Registers altered:
 158
 159     CR0     (if Rc=1)
 160
 161 ----------
 162
 163 \newpage{}
 164
 165 ## Floating Move To GPR Single
 166
 167 ```
 168     fmvtgs RT, FRB
 169     fmvtgs. RT, FRB
 170 ```
 171
 172 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 173 |-----|------|-------|-------|-------|----|--------|
 174 | PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |
 175
 176 ```
 177     RT <- [0] * 32 || SINGLE((FRB))  # SINGLE since that's what stfs uses
 178 ```
 179
Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
 181 representation directly. This is equivalent to `stfs` followed by `lwz`.
 182 As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.
 183
 184 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 185 operations.
 186
 187 Special Registers altered:
 188
 189     CR0     (if Rc=1)
 190
 191 ----------
 192
 193 \newpage{}
 194
 195 ## Double-Precision Floating Move From GPR
 196
 197 ```
 198     fmvfg FRT, RB
 199     fmvfg. FRT, RB
 200 ```
 201
 202 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 203 |-----|------|-------|-------|-------|----|--------|
 204 | PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |
 205
 206 ```
 207     FRT <- (RB)
 208 ```
 209
Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
 211 representation directly. This is equivalent to `std` followed by `lfd`.
 212 As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.
 213
 214 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 215 operations.
 216
 217 Special Registers altered:
 218
 219     CR1     (if Rc=1)
 220
 221 ----------
 222
 223 \newpage{}
 224
 225 ## Floating Move From GPR Single
 226
 227 ```
 228     fmvfgs FRT, RB
 229     fmvfgs. FRT, RB
 230 ```
 231
 232 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 233 |-----|------|-------|-------|-------|----|--------|
 234 | PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |
 235
 236 ```
 237     FRT <- DOUBLE((RB)[32:63])  # DOUBLE since that's what lfs uses
 238 ```
 239
Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
 241 representation directly. This is equivalent to `stw` followed by `lfs`.
 242 As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.
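
As a rough software model, only the low 32 bits of the GPR are interpreted
as a binary32 value, which is then widened to the 64-bit register format.
The Python function `fmvfgs` below is an illustrative sketch under that
reading, using `struct` to stand in for `stw` followed by `lfs`.

```python
import struct

def fmvfgs(rb: int) -> float:
    """Interpret the low 32 bits of rb as binary32 and widen to a double."""
    low32 = rb & 0xFFFF_FFFF  # (RB)[32:63] in MSB0 bit numbering
    # unpacking ">f" yields a Python float (binary64), mirroring DOUBLE()
    return struct.unpack(">f", struct.pack(">I", low32))[0]
```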
 243
 244 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 245 operations.
 246
 247 Special Registers altered:
 248
 249     CR1     (if Rc=1)
 250
 251 ----------
 252
 253 \newpage{}
 254
 255 ## Double-Precision Floating Convert From Integer In GPR
 256
 257 ```
 258     fcvtfg FRT, RB, IT
 259     fcvtfg. FRT, RB, IT
 260 ```
 261
 262 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
 263 |-----|------|-------|-------|-------|-------|----|--------|
 264 | PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |
 265
 266 ```
 267     if IT[0] = 0 then  # 32-bit int -> 64-bit float
 268         # rounding never necessary, so don't touch FPSCR
 269         # based off xvcvsxwdp
 270         if IT = 0 then  # Signed 32-bit
 271             src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 272         else  # IT = 1 -- Unsigned 32-bit
 273             src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 274         FRT <- bfp64_CONVERT_FROM_BFP(src)
 275     else
 276         # rounding may be necessary. based off xscvuxdsp
 277         reset_xflags()
 278         switch(IT)
 279             case(0):  # Signed 32-bit
 280                 src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 281             case(1):  # Unsigned 32-bit
 282                 src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 283             case(2):  # Signed 64-bit
 284                 src <- bfp_CONVERT_FROM_SI64((RB))
 285             default:  # Unsigned 64-bit
 286                 src <- bfp_CONVERT_FROM_UI64((RB))
 287         rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
 288         result <- bfp64_CONVERT_FROM_BFP(rnd)
 289         cls <- fprf_CLASS_BFP64(result)
 290
 291         if xx_flag = 1 then SetFX(FPSCR.XX)
 292
 293         FRT <- result
 294         FPSCR.FPRF <- cls
 295         FPSCR.FR <- inc_flag
 296         FPSCR.FI <- xx_flag
 297 ```
 298 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
 299 don't remove them -->
 300
Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.
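
The claim that 32-bit sources never need rounding follows from the 53-bit
significand of a 64-bit float. It can be checked with a small Python sketch.
The helper name `converts_exactly` is hypothetical, and Python's `float()`
always rounds to nearest-even, so it stands in for only one `FPSCR.RN`
setting.

```python
def converts_exactly(value: int) -> bool:
    """True if converting the integer to binary64 loses no information."""
    return int(float(value)) == value

# every 32-bit integer fits in the 53-bit significand of binary64
print(converts_exactly(-2**31), converts_exactly(2**32 - 1))
# a 64-bit integer may need rounding: 2**53 + 1 is not representable
print(converts_exactly(2**53 + 1))
```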
 308
 309 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 310 operations.
 311
 312 Special Registers altered:
 313
 314     CR1     (if Rc=1)
    FPSCR   (TODO: which bits?) (if IT[0]=1)
 316
 317 ### Assembly Aliases
 318
 319 | Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
 320 |----------------------|----------------------|------|----------------------|----------------------|
 321 | `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
 322 | `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
 323 | `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
 324 | `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |
 325
 326 ----------
 327
 328 \newpage{}
 329
 330 ## Floating Convert From Integer In GPR Single
 331
 332 ```
 333     fcvtfgs FRT, RB, IT
 334     fcvtfgs. FRT, RB, IT
 335 ```
 336
 337 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
 338 |-----|------|-------|-------|-------|-------|----|--------|
 339 | PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |
 340
 341 ```
 342     # rounding may be necessary. based off xscvuxdsp
 343     reset_xflags()
 344     switch(IT)
 345         case(0):  # Signed 32-bit
 346             src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 347         case(1):  # Unsigned 32-bit
 348             src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 349         case(2):  # Signed 64-bit
 350             src <- bfp_CONVERT_FROM_SI64((RB))
 351         default:  # Unsigned 64-bit
 352             src <- bfp_CONVERT_FROM_UI64((RB))
 353     rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
 354     result32 <- bfp32_CONVERT_FROM_BFP(rnd)
 355     cls <- fprf_CLASS_BFP32(result32)
 356     result <- DOUBLE(result32)
 357
 358     if xx_flag = 1 then SetFX(FPSCR.XX)
 359
 360     FRT <- result
 361     FPSCR.FPRF <- cls
 362     FPSCR.FR <- inc_flag
 363     FPSCR.FI <- xx_flag
 364 ```
 365 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
 366 don't remove them -->
 367
Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.
 371
 372 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 373 operations.
 374
 375 Special Registers altered:
 376
 377     CR1     (if Rc=1)
    FPSCR   (TODO: which bits?)
 379
 380 ### Assembly Aliases
 381
| Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
|----------------------|-----------------------|------|----------------------|-----------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |
 388
 389 ----------
 390
 391 \newpage{}
 392
 393 ## Floating-point to Integer Conversion Overview
 394
 395 <div id="fpr-to-gpr-conversion-mode"></div>
 396
 397 IEEE 754 doesn't specify what results are obtained when converting a NaN
 398 or out-of-range floating-point value to integer, so different programming
 399 languages and ISAs have made different choices.  Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.
 402
 403 For convenience, we will give those different conversion semantics names
 404 based on which common ISA or programming language uses them, since there
 405 may not be an established name for them:
 406
 407 **Standard OpenPower conversion**
 408
 409 This conversion performs "saturation with NaN converted to minimum
 410 valid integer". This is also exactly the same as the x86 ISA conversion
 411 semantics.  OpenPOWER however has instructions for both:
 412
 413 * rounding mode read from FPSCR
 414 * rounding mode always set to truncate
 415
 416 **Java/Saturating conversion**
 417
 418 For the sake of simplicity, the FP -> Integer conversion semantics
 419 generalized from those used by Java's semantics (and Rust's `as`
 420 operator) will be referred to as [Java/Saturating conversion
 421 semantics](#fp-to-int-java-saturating-conversion-semantics).
 422
 423 Those same semantics are used in some way by all of the following
 424 languages (not necessarily for the default conversion method):
 425
 426 * Java's
 427   [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 428   (only for long/int results)
 429 * Rust's FP -> Integer conversion using the
 430   [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 431 * LLVM's
 432   [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
 433   [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
 434 * SPIR-V's OpenCL dialect's
 435   [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
 436   [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
 437   instructions when decorated with
 438   [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
 442
 443 **JavaScript conversion**
 444
 445 For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
 447 abstract operation will be referred to as [JavaScript conversion
 448 semantics](#fp-to-int-javascript-conversion-semantics).
 449
This conversion is present in the ARM ISA as the `FJCVTZS` instruction:
<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
 452
 453 **Rc=1 and OE=1**
 454
 455 All of these instructions have an Rc=1 mode which sets CR0
 456 in the normal way for any instructions producing a GPR result.
 457 Additionally, when OE=1, if the numerical value of the FP number
 458 is not 100% accurately preserved (due to truncation or saturation
 459 and including when the FP number was NaN) then this is considered
 460 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
 461 are all set as normal for any GPR instructions that overflow.
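
The overflow condition described above amounts to asking whether the integer
result, read back as a number, still equals the source. A minimal Python
sketch, where `conversion_overflows` is a hypothetical helper and not a name
from the specification:

```python
import math

def conversion_overflows(src: float, result: int) -> bool:
    """True when the integer result does not exactly preserve src."""
    # Python compares float and int by exact numerical value
    return math.isnan(src) or src != result
```

For example, `conversion_overflows(1.5, 1)` is true because of truncation, and
`conversion_overflows(3.0, 3)` is false.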
 462
 463 \newpage{}
 464
 465 ### FP to Integer Conversion Simplified Pseudo-code
 466
 467 Key for pseudo-code:
 468
 469 | term                      | result type | definition                                                                                         |
 470 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
 471 | `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                       |
 472 | `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                              |
 473 | `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                                |
 474 | `int::BITS`               | `int`       | the bit-width of `int`                                                                             |
| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                            |
| `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                              |
| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                              |
| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                           |
 479 | `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.           |
 480 | `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode`      |
 481
 482 <div id="fp-to-int-openpower-conversion-semantics"></div>
 483 OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
 484 Power ISA v3.1B):
 485
 486 ```
 487     def fp_to_int_open_power<fp, int>(v: fp) -> int:
 488         if v is NaN:
 489             return int::MIN_VALUE
 490         if v >= int::MAX_VALUE:
 491             return int::MAX_VALUE
 492         if v <= int::MIN_VALUE:
 493             return int::MIN_VALUE
 494         return (int)rint(v, rounding_mode)
 495 ```
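
A runnable Python rendering of the simplified pseudo-code above may help when
checking edge cases. It is a sketch under stated assumptions: the `bits`,
`signed` and `rint` parameters are illustrative stand-ins for the `int` type
and `rounding_mode`, with `math.trunc` modelling Truncate and Python's built-in
`round` modelling round-to-nearest-even.

```python
import math

def fp_to_int_open_power(v: float, bits: int = 32, signed: bool = True,
                         rint=math.trunc) -> int:
    """OpenPower-style conversion: saturate, NaN gives the minimum value."""
    min_value = -(1 << (bits - 1)) if signed else 0
    max_value = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return min_value
    if v >= max_value:  # also covers +infinity
        return max_value
    if v <= min_value:  # also covers -infinity
        return min_value
    return int(rint(v))
```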
 496
 497 <div id="fp-to-int-java-saturating-conversion-semantics"></div>
 498 [Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 499 (only for long/int results)/
 500 [Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 501 (with adjustment to add non-truncate rounding modes):
 502
 503 ```
 504     def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
 505         if v is NaN:
 506             return 0
 507         if v >= int::MAX_VALUE:
 508             return int::MAX_VALUE
 509         if v <= int::MIN_VALUE:
 510             return int::MIN_VALUE
 511         return (int)rint(v, rounding_mode)
 512 ```
 513
 514 <div id="fp-to-int-javascript-conversion-semantics"></div>
 515 Section 7.1 of the ECMAScript / JavaScript
 516 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
 517 (with adjustment to add non-truncate rounding modes):
 518
 519 ```
 520     def fp_to_int_java_script<fp, int>(v: fp) -> int:
 521         if v is NaN or infinite:
 522             return 0
 523         v = rint(v, rounding_mode)  # assume no loss of precision in result
 524         v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
 525         bits = (uint)v
 526         return (int)bits
 527 ```
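
The modular wrap-around is the part of these semantics that is easiest to get
wrong, so a runnable Python sketch of the pseudo-code above is given here. The
`bits`, `signed` and `rint` parameters are illustrative assumptions standing in
for the `int` type and `rounding_mode`. Python integers are arbitrary
precision, which satisfies the "no loss of precision" assumption.

```python
import math

def fp_to_int_java_script(v: float, bits: int = 32, signed: bool = True,
                          rint=math.trunc) -> int:
    """JavaScript ToInt32-style conversion generalized to `bits` bits."""
    if math.isnan(v) or math.isinf(v):
        return 0
    value_count = 1 << bits
    wrapped = int(rint(v)) % value_count  # always in [0, value_count)
    if signed and wrapped >= value_count >> 1:
        # reinterpret the bit pattern as a two's complement value
        wrapped -= value_count
    return wrapped
```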
 528
 529
 530 ----------
 531
 532 \newpage{}
 533
 534
 535 ## Double-Precision Floating Convert To Integer In GPR
 536
 537 ```
 538     fcvttg RT, FRB, CVM, IT
 539     fcvttg. RT, FRB, CVM, IT
 540     fcvttgo RT, FRB, CVM, IT
 541     fcvttgo. RT, FRB, CVM, IT
 542 ```
 543
 544 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
 545 |-----|------|-------|-------|-------|-------|----|----|---------|
 546 | PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |
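
The field layout in the table above can be sketched as a bit-packing routine
in Python. This is illustrative only: `encode_fcvttg` is a hypothetical helper,
and because this RFC does not allocate `PO` or `XO` values, both are left as
caller-supplied parameters.

```python
def encode_fcvttg(po: int, rt: int, it: int, cvm: int, frb: int,
                  xo: int, oe: int = 0, rc: int = 0) -> int:
    """Pack the XO-Form fields into a 32-bit instruction word."""
    fields = [  # (value, width in bits), in MSB0 order: bit 0 first
        (po, 6), (rt, 5), (it, 2), (cvm, 3),
        (frb, 5), (xo, 9), (oe, 1), (rc, 1),
    ]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit its width")
        word = (word << width) | value
    return word
```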
 547
 548 ```
 549     # based on xscvdpuxws
 550     reset_xflags()
 551     src <- bfp_CONVERT_FROM_BFP64((FRB))
 552
 553     switch(IT)
 554         case(0):  # Signed 32-bit
 555             range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
 556             range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
 557             js_mask <- 0xFFFF_FFFF
 558         case(1):  # Unsigned 32-bit
 559             range_min <- bfp_CONVERT_FROM_UI32(0)
 560             range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
 561             js_mask <- 0xFFFF_FFFF
 562         case(2):  # Signed 64-bit
 563             range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
 564             range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
 565             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 566         default:  # Unsigned 64-bit
 567             range_min <- bfp_CONVERT_FROM_UI64(0)
 568             range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
 569             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 570
 571     if CVM[2] = 1 or FPSCR.RN = 0b01 then
 572         rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
 573     else if FPSCR.RN = 0b00 then
 574         rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
 575     else if FPSCR.RN = 0b10 then
 576         rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
 577     else if FPSCR.RN = 0b11 then
 578         rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)
 579
 580     switch(CVM)
 581         case(0, 1):  # OpenPower semantics
 582             if IsNaN(rnd) then
 583                 result <- si64_CONVERT_FROM_BFP(range_min)
 584             else if bfp_COMPARE_GT(rnd, range_max) then
 585                 result <- ui64_CONVERT_FROM_BFP(range_max)
 586             else if bfp_COMPARE_LT(rnd, range_min) then
 587                 result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                # in range: convert the rounded value itself
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        case(2, 3):  # Java/Saturating semantics
            if IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                # in range: convert the rounded value itself
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
 603         default:  # JavaScript semantics
 604             # CVM = 6, 7 are illegal instructions
 605             # this works because the largest type we try to convert from has
 606             # 53 significand bits, and the largest type we try to convert to
 607             # has 64 bits, and the sum of those is strictly less than the 128
 608             # bits of the intermediate result.
 609             limit <- bfp_CONVERT_FROM_UI128([1] * 128)
 610             if IsInf(rnd) or IsNaN(rnd) then
 611                 result <- [0] * 64
 612             else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
 613                 result <- [0] * 64
 614             else
 615                 result128 <- si128_CONVERT_FROM_BFP(rnd)
 616                 result <- result128[64:127] & js_mask
 617
 618     switch(IT)
 619         case(0):  # Signed 32-bit
 620             result <- EXTS64(result[32:63])
 621             result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
 622         case(1):  # Unsigned 32-bit
 623             result <- EXTZ64(result[32:63])
 624             result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
 625         case(2):  # Signed 64-bit
 626             result_bfp <- bfp_CONVERT_FROM_SI64(result)
 627         default:  # Unsigned 64-bit
 628             result_bfp <- bfp_CONVERT_FROM_UI64(result)
 629
 630     if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
 631     if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
 632     if xx_flag = 1 then SetFX(FPSCR.XX)
 633
 634     vx_flag <- vxsnan_flag | vxcvi_flag
 635     vex_flag <- FPSCR.VE & vx_flag
 636
 637     if vex_flag = 0 then
 638         RT <- result
 639         FPSCR.FPRF <- undefined
 640         FPSCR.FR <- inc_flag
 641         FPSCR.FI <- xx_flag
 642         if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
 643             overflow <- 1  # signals SO only when OE = 1
 644     else
 645         FPSCR.FR <- 0
 646         FPSCR.FI <- 0
 647 ```
 648
Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.
 652
 653 These instructions have an Rc=1 mode which sets CR0 in the normal
 654 way for any instructions producing a GPR result.  Additionally, when OE=1,
 655 if the numerical value of the FP number is not 100% accurately preserved
 656 (due to truncation or saturation and including when the FP number was
 657 NaN) then this is considered to be an Integer Overflow condition, and
 658 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
 659 that overflow.
 660
 661 Special Registers altered:
 662
 663     CR0              (if Rc=1)
 664     XER SO, OV, OV32 (if OE=1)
    FPSCR   (TODO: which bits?)
 666
 667 ### Assembly Aliases
 668
 669 | Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
 670 |---------------------------|----------------------------|---------------------------|----------------------------|
 671 | `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
 672 | `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
 673 | `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
 674 | `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
 675 | `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
 676 | `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
 677 | `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
 678 | `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
 679
 680 ----------
 681
 682 \newpage{}
 683
 684 ## Floating Convert Single To Integer In GPR
 685
 686 ```
 687     fcvtstg RT, FRB, CVM, IT
 688     fcvtstg. RT, FRB, CVM, IT
 689     fcvtstgo RT, FRB, CVM, IT
 690     fcvtstgo. RT, FRB, CVM, IT
 691 ```
 692
 693 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
 694 |-----|------|-------|-------|-------|-------|----|----|---------|
 695 | PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |
 696
 697 ```
 698     # based on xscvdpuxws
 699     reset_xflags()
 700     src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))
 701
 702     switch(IT)
 703         case(0):  # Signed 32-bit
 704             range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
 705             range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
 706             js_mask <- 0xFFFF_FFFF
 707         case(1):  # Unsigned 32-bit
 708             range_min <- bfp_CONVERT_FROM_UI32(0)
 709             range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
 710             js_mask <- 0xFFFF_FFFF
 711         case(2):  # Signed 64-bit
 712             range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
 713             range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
 714             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 715         default:  # Unsigned 64-bit
 716             range_min <- bfp_CONVERT_FROM_UI64(0)
 717             range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
 718             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 719
 720     if CVM[2] = 1 or FPSCR.RN = 0b01 then
 721         rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
 722     else if FPSCR.RN = 0b00 then
 723         rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
 724     else if FPSCR.RN = 0b10 then
 725         rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
 726     else if FPSCR.RN = 0b11 then
 727         rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)
 728
 729     switch(CVM)
 730         case(0, 1):  # OpenPower semantics
 731             if IsNaN(rnd) then
 732                 result <- si64_CONVERT_FROM_BFP(range_min)
 733             else if bfp_COMPARE_GT(rnd, range_max) then
 734                 result <- ui64_CONVERT_FROM_BFP(range_max)
 735             else if bfp_COMPARE_LT(rnd, range_min) then
 736                 result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                # in range: convert the rounded value itself
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        case(2, 3):  # Java/Saturating semantics
            if IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                # in range: convert the rounded value itself
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
 752         default:  # JavaScript semantics
 753             # CVM = 6, 7 are illegal instructions
 754             # this works because the largest type we try to convert from has
 755             # 53 significand bits, and the largest type we try to convert to
 756             # has 64 bits, and the sum of those is strictly less than the 128
 757             # bits of the intermediate result.
 758             limit <- bfp_CONVERT_FROM_UI128([1] * 128)
 759             if IsInf(rnd) or IsNaN(rnd) then
 760                 result <- [0] * 64
 761             else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
 762                 result <- [0] * 64
 763             else
 764                 result128 <- si128_CONVERT_FROM_BFP(rnd)
 765                 result <- result128[64:127] & js_mask
 766
 767     switch(IT)
 768         case(0):  # Signed 32-bit
 769             result <- EXTS64(result[32:63])
 770             result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
 771         case(1):  # Unsigned 32-bit
 772             result <- EXTZ64(result[32:63])
 773             result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
 774         case(2):  # Signed 64-bit
 775             result_bfp <- bfp_CONVERT_FROM_SI64(result)
 776         default:  # Unsigned 64-bit
 777             result_bfp <- bfp_CONVERT_FROM_UI64(result)
 778
 779     if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
 780     if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
 781     if xx_flag = 1 then SetFX(FPSCR.XX)
 782
 783     vx_flag <- vxsnan_flag | vxcvi_flag
 784     vex_flag <- FPSCR.VE & vx_flag
 785
 786     if vex_flag = 0 then
 787         RT <- result
 788         FPSCR.FPRF <- undefined
 789         FPSCR.FR <- inc_flag
 790         FPSCR.FI <- xx_flag
 791         if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
 792             overflow <- 1  # signals SO only when OE = 1
 793     else
 794         FPSCR.FR <- 0
 795         FPSCR.FI <- 0
 796 ```
 797
 798 Convert the 32-bit float in FRB to an unsigned/signed 32/64-bit integer
 799 in RT, with the conversion overflow/rounding semantics selected by the
 800 chosen `CVM` value. FRB holds the value in the usual 32-bit float in
 801 64-bit float format. `FPSCR` is modified and exceptions are raised as usual.
 802
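The `CVM`-dependent handling of NaN and out-of-range inputs described above can be modelled in a few lines of Python. This is a simplified sketch, not the normative pseudo-code: it assumes round-toward-zero for every `CVM` value, relies on Python's exact int/float comparison in place of the `bfp_*` helpers, and assumes that `range_min`, `range_max` and `js_mask` are the usual two's-complement limits for each `IT` (they are defined outside this excerpt). The name `fp_to_gpr` is hypothetical. For example, under these assumptions `fp_to_gpr(2.0**31, 4, 0)` wraps to `0xFFFF_FFFF_8000_0000`, whereas `fp_to_gpr(2.0**31, 2, 0)` saturates to `0x7FFF_FFFF`.

```python
import math

MASK64 = (1 << 64) - 1

# IT -> (range_min, range_max, js_mask). Assumed two's-complement limits;
# the normative values are set earlier in the pseudo-code.
RANGES = {
    0: (-(1 << 31), (1 << 31) - 1, 0xFFFF_FFFF),  # signed 32-bit
    1: (0, (1 << 32) - 1, 0xFFFF_FFFF),           # unsigned 32-bit
    2: (-(1 << 63), (1 << 63) - 1, MASK64),       # signed 64-bit
    3: (0, MASK64, MASK64),                       # unsigned 64-bit
}


def fp_to_gpr(x: float, cvm: int, it: int) -> int:
    """Model the 64-bit value written to RT (hypothetical helper)."""
    if cvm not in range(6):
        raise ValueError("CVM = 6, 7 are illegal instructions")
    range_min, range_max, js_mask = RANGES[it]
    # Assumption: round toward zero; non-finite inputs are left untouched.
    rnd = math.trunc(x) if math.isfinite(x) else x

    if cvm < 4:  # OpenPower (0, 1) and Java/saturating (2, 3) semantics
        if math.isnan(x):
            result = range_min if cvm < 2 else 0
        elif rnd > range_max:
            result = range_max
        elif rnd < range_min:
            result = range_min
        else:
            result = rnd
    else:  # JavaScript semantics (4, 5): wrap modulo 2**32 or 2**64
        limit = (1 << 128) - 1
        if not math.isfinite(x) or abs(rnd) > limit:
            result = 0
        else:
            result = rnd & js_mask

    if it == 0:  # EXTS64 of the low 32 bits
        low = result & 0xFFFF_FFFF
        result = low - (1 << 32) if low & 0x8000_0000 else low
    elif it == 1:  # EXTZ64 of the low 32 bits
        result &= 0xFFFF_FFFF
    return result & MASK64
```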
 803 These instructions have an Rc=1 mode which sets CR0 in the normal
 804 way for any instruction producing a GPR result.  Additionally, when OE=1,
 805 if the numerical value of the FP number is not preserved exactly
 806 (due to truncation or saturation, and including when the FP number was
 807 NaN) then this is considered to be an Integer Overflow condition, and
 808 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instruction
 809 that overflows.
 810
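The Integer Overflow condition above corresponds to the `bfp_COMPARE_EQ(src, result_bfp)` test in the pseudo-code: the integer written to RT is converted back and compared with the original FP value. A minimal Python sketch of that check follows, assuming `rt` is the 64-bit RT pattern as a non-negative integer; the helper names `integer_value` and `sets_overflow` are hypothetical. Under this model a fractional input such as 3.7 reports overflow even though 3 is in range, because truncation changed the value.

```python
import math


def integer_value(rt: int, it: int) -> int:
    """Reinterpret the RT bit pattern as the integer type selected by IT."""
    bits = 32 if it in (0, 1) else 64
    value = rt & ((1 << bits) - 1)
    # IT = 0 and IT = 2 are the signed types: apply two's complement.
    if it in (0, 2) and value >> (bits - 1):
        value -= 1 << bits
    return value


def sets_overflow(src: float, rt: int, it: int) -> bool:
    """True when the FP value was not preserved exactly (or was NaN)."""
    # Python compares float and int exactly, so no precision is lost here.
    return math.isnan(src) or src != integer_value(rt, it)
```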
 811 Special Registers altered:
 812
 813     CR0              (if Rc=1)
 814     XER SO, OV, OV32 (if OE=1)
 815     FPSCR   (TODO: which bits?)
 816
 817 ### Assembly Aliases
 818
 819 | Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
 820 |----------------------------|-----------------------------|----------------------------|-----------------------------|
 821 | `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
 822 | `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
 823 | `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
 824 | `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
 825 | `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
 826 | `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
 827 | `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
 828 | `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
 829
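The alias table follows a regular pattern: an optional `u` selects an unsigned result, `w` or `d` selects 32-bit or 64-bit width, and the `o` and `.` suffixes carry over unchanged to the full instruction. The sketch below derives the trailing `IT` operand from an alias mnemonic under that reading of the table; `expand_alias` is a hypothetical helper for illustration, not part of any assembler. For instance, `expand_alias("fcvtstgudo.")` gives `fcvtstgo. RT, FRB, CVM, 3`, matching the last row.

```python
import re

# fcvtstg + optional "u" + width "w"/"d" + optional "o" + optional "."
ALIAS = re.compile(r"fcvtstg(u?)([wd])(o?)(\.?)")


def expand_alias(mnemonic: str, operands: str = "RT, FRB, CVM") -> str:
    """Expand an alias from the table into its full-instruction form."""
    match = ALIAS.fullmatch(mnemonic)
    if match is None:
        raise ValueError(f"not a fcvtstg alias: {mnemonic!r}")
    unsigned, width, oe, rc = match.groups()
    # IT: 0 = signed 32-bit, 1 = unsigned 32-bit,
    #     2 = signed 64-bit, 3 = unsigned 64-bit
    it = (2 if width == "d" else 0) + (1 if unsigned else 0)
    return f"fcvtstg{oe}{rc} {operands}, {it}"
```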
 830 ----------
 831
 832 \newpage{}
 833
 834 ----------
 835
 836 # Appendices
 837
 838     Appendix E Power ISA sorted by opcode
 839     Appendix F Power ISA sorted by version
 840     Appendix G Power ISA sorted by Compliancy Subset
 841     Appendix H Power ISA sorted by mnemonic
 842
 843 |Form| Book | Page | Version | mnemonic | Description |
 844 |----|------|------|---------|----------|-------------|
 845 |VA  | I    | #    | 3.2B    |todo   | |
 846
 847 ----------------
 848
 849 [[!tag opf_rfc]]