openpower/sv/rfc/ls006.mdwn

# RFC ls006 FPR <-> GPR Move/Conversion

**URLs**:

* <https://libre-soc.org/openpower/sv/int_fp_mv/>
* <https://libre-soc.org/openpower/sv/rfc/ls006/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Instructions added

* `fmvtg` -- Floating Move To GPR
* `fmvfg` -- Floating Move From GPR
* `fcvttg` -- Floating Convert To Integer In GPR
* `fcvtstg` -- Floating Convert Single To Integer In GPR
* `fcvtfg` -- Floating Convert From Integer In GPR

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of five new GPR-FPR-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
    GPR, FPR, Move, Conversion, JavaScript
```

**Motivation**

CPUs without VSX/VMX lack a way to transfer data efficiently between
FPRs and GPRs: values have to go through memory. This proposal adds
instructions (both bitwise copy and Integer <-> FP conversion) that
transfer data directly between FPRs and GPRs without needing to go
through memory.

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. An overview of the
different variants, listing the languages and hardware that implement
each variant, is given in the Floating-point to Integer Conversion
Overview section.

**Notes and Observations**:

* These instructions are present in many other ISAs.
* JavaScript rounding as one instruction saves 35 instructions including
  six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)

**Changes**

Add the following entries to:

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# Immediate Tables

Tables that are used by
`fmvtg[s][.]`/`fmvfg[s][.]`/`fcvttg[s][.]`/`fcvtfg[s][.]`:

## `RCS` -- `Rc` and `s`

| `RCS` | `Rc` | FP Single Mode | Assembly Alias Mnemonic |
|-------|------|----------------|-------------------------|
| 0     | 0    | Double         | `<op>`                  |
| 1     | 1    | Double         | `<op>.`                 |
| 2     | 0    | Single         | `<op>s`                 |
| 3     | 1    | Single         | `<op>s.`                |

## `IT` -- Integer Type

| `IT` | Integer Type    | Assembly Alias Mnemonic |
|------|-----------------|-------------------------|
| 0    | Signed 32-bit   | `<op>w`                 |
| 1    | Unsigned 32-bit | `<op>uw`                |
| 2    | Signed 64-bit   | `<op>d`                 |
| 3    | Unsigned 64-bit | `<op>ud`                |

## `CVM` -- Float to Integer Conversion Mode

| `CVM` | `rounding_mode` | Semantics                        |
|-------|-----------------|----------------------------------|
| 000   | from `FPSCR`    | [OpenPower semantics]            |
| 001   | Truncate        | [OpenPower semantics]            |
| 010   | from `FPSCR`    | [Java/Saturating semantics]      |
| 011   | Truncate        | [Java/Saturating semantics]      |
| 100   | from `FPSCR`    | [JavaScript semantics]           |
| 101   | Truncate        | [JavaScript semantics]           |
| rest  | --              | illegal instruction trap for now |

[OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
[Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
[JavaScript semantics]: #fp-to-int-javascript-conversion-semantics

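The `CVM` table can be read as a 3-bit field in which the least
significant bit selects truncation and the upper two bits select the
conversion semantics. The following Python sketch only illustrates that
reading of the table; `decode_cvm` and the strings it returns are
hypothetical names chosen for this example, not part of the proposed
encoding.

```python
def decode_cvm(cvm: int) -> tuple[str, str]:
    """Decode a 3-bit CVM value into (rounding_mode, semantics).

    Mirrors the CVM table: the least significant bit selects truncation,
    the upper two bits select the conversion semantics.
    """
    if not 0 <= cvm <= 0b111:
        raise ValueError("CVM is a 3-bit field")
    semantics = {0: "OpenPower", 1: "Java/Saturating", 2: "JavaScript"}.get(cvm >> 1)
    if semantics is None:
        # CVM = 0b110 and 0b111 are reserved (illegal instruction trap for now)
        raise ValueError("reserved CVM value")
    rounding_mode = "Truncate" if cvm & 1 else "from FPSCR"
    return rounding_mode, semantics
```
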
----------

\newpage{}

## Floating Move To GPR

```
    fmvtg RT, FRB
    fmvtg. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
    RT <- (FRB)
```

Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

    CR0     (if Rc=1)

----------

\newpage{}

## Floating Move To GPR Single

```
    fmvtgs RT, FRB
    fmvtgs. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
    RT <- [0] * 32 || SINGLE((FRB))  # SINGLE since that's what stfs uses
```

Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

    CR0     (if Rc=1)

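As a rough illustration of the two to-GPR moves, the following Python
sketch reproduces the store-then-load equivalence with the standard
`struct` module. It models an FPR as a Python `float`, which is an
assumption of the sketch rather than part of the proposal: `fmvtgs` is
only meaningful here for values exactly representable in single
precision, and NaN payloads may not survive Python's own float handling.
The function names simply reuse the mnemonics.

```python
import struct


def fmvtg(frb: float) -> int:
    """Return the 64-bit IEEE 754 bit pattern of frb (like stfd then ld)."""
    return struct.unpack(">Q", struct.pack(">d", frb))[0]


def fmvtgs(frb: float) -> int:
    """Return the 32-bit IEEE 754 bit pattern of frb, zero-extended
    to 64 bits (like stfs then lwz)."""
    return struct.unpack(">I", struct.pack(">f", frb))[0]


print(hex(fmvtg(1.0)))   # 0x3ff0000000000000
print(hex(fmvtgs(1.0)))  # 0x3f800000
```
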
----------

\newpage{}

## Floating Move From GPR

```
    fmvfg FRT, RB
    fmvfg. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
    FRT <- (RB)
```

Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)

----------

\newpage{}

## Floating Move From GPR Single

```
    fmvfgs FRT, RB
    fmvfgs. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
    FRT <- DOUBLE((RB)[32:63])  # DOUBLE since that's what lfs uses
```

Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)

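The from-GPR moves can be sketched the same way. This is a minimal
Python illustration using the standard `struct` module, under the
assumption that a GPR is modelled as a Python `int` and an FPR as a
Python `float`; the function names reuse the mnemonics and are not an
official reference model.

```python
import struct


def fmvfg(rb: int) -> float:
    """Reinterpret the 64 bits of rb as a double (like std then lfd)."""
    return struct.unpack(">d", struct.pack(">Q", rb & 0xFFFF_FFFF_FFFF_FFFF))[0]


def fmvfgs(rb: int) -> float:
    """Reinterpret (RB)[32:63], the low 32 bits, as a single and widen it
    to a double (like stw then lfs). The upper 32 bits are ignored."""
    return struct.unpack(">f", struct.pack(">I", rb & 0xFFFF_FFFF))[0]
```
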
----------

\newpage{}

## Floating Convert From Integer In GPR

```
    fcvtfg FRT, RB, IT
    fcvtfg. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
    if IT[0] = 0 then  # 32-bit int -> 64-bit float
        # rounding never necessary, so don't touch FPSCR
        # based off xvcvsxwdp
        if IT = 0 then  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        else  # IT = 1 -- Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        FRT <- bfp64_CONVERT_FROM_BFP(src)
    else
        # rounding may be necessary. based off xscvuxdsp
        reset_xflags()
        switch(IT)
            case(0):  # Signed 32-bit
                src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
            case(1):  # Unsigned 32-bit
                src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
            case(2):  # Signed 64-bit
                src <- bfp_CONVERT_FROM_SI64((RB))
            default:  # Unsigned 64-bit
                src <- bfp_CONVERT_FROM_UI64((RB))
        rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
        result <- bfp64_CONVERT_FROM_BFP(rnd)
        cls <- fprf_CLASS_BFP64(result)

        if xx_flag = 1 then SetFX(FPSCR.XX)

        FRT <- result
        FPSCR.FPRF <- cls
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)
    FPSCR   (TODO: which bits?) (if IT[0]=1)

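The claim that 32-bit sources never need rounding follows from the
53-bit significand of a 64-bit float, whereas 64-bit sources can exceed
it. The Python sketch below illustrates this. It is an assumption-laden
model, not the specification: it covers only round-to-nearest-even
(`FPSCR.RN = 0b00`), because that is what Python's own int-to-float
conversion performs, and the `inexact` value it returns merely stands in
for `xx_flag`.

```python
def fcvtfg(rb: int, it: int) -> tuple[float, bool]:
    """Convert the integer in rb, interpreted per IT, to a Python float.

    Returns (result, inexact); inexact plays the role of xx_flag.
    Only round-to-nearest-even is modelled.
    """
    bits = 32 if it < 2 else 64          # IT[0] = 0 selects 32-bit sources
    value = rb & ((1 << bits) - 1)       # (RB)[32:63] or the whole of (RB)
    if it in (0, 2) and value >> (bits - 1):
        value -= 1 << bits               # reinterpret as signed
    result = float(value)
    return result, int(result) != value
```
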
### Assembly Aliases

| Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
|----------------------|----------------------|------|----------------------|----------------------|
| `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
| `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
| `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
| `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |

----------

\newpage{}

## Floating Convert From Integer In GPR Single

```
    fcvtfgs FRT, RB, IT
    fcvtfgs. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0):  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1):  # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2):  # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default:  # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
    result32 <- bfp32_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP32(result32)
    result <- DOUBLE(result32)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)
    FPSCR   (TODO: which bits?)

### Assembly Aliases

| Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
|----------------------|-----------------------|------|----------------------|-----------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |

----------

\newpage{}

## Floating-point to Integer Conversion Overview

<div id="fpr-to-gpr-conversion-mode"></div>

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices.  Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.

For convenience, we will give those different conversion semantics names
based on which common ISA or programming language uses them, since there
may not be an established name for them:

**Standard OpenPower conversion**

This conversion performs "saturation with NaN converted to minimum
valid integer". This is also exactly the same as the x86 ISA conversion
semantics.  OpenPOWER however has instructions for both:

* rounding mode read from FPSCR
* rounding mode always set to truncate

**Java/Saturating conversion**

For the sake of simplicity, the FP -> Integer conversion semantics
generalized from those used by Java's semantics (and Rust's `as`
operator) will be referred to as [Java/Saturating conversion
semantics](#fp-to-int-java-saturating-conversion-semantics).

Those same semantics are used in some way by all of the following
languages (not necessarily for the default conversion method):

* Java's
  [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
  (only for long/int results)
* Rust's FP -> Integer conversion using the
  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
* LLVM's
  [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
  [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
* SPIR-V's OpenCL dialect's
  [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
  [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
  instructions when decorated with
  [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)

**JavaScript conversion**

For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).

This instruction is present in ARM assembler as FJCVTZS:
<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>

**Rc=1 and OE=1**

All of these instructions have an Rc=1 mode which sets CR0
in the normal way for any instructions producing a GPR result.
Additionally, when OE=1, if the numerical value of the FP number
is not 100% accurately preserved (due to truncation or saturation
and including when the FP number was NaN) then this is considered
to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
are all set as normal for any GPR instructions that overflow.

\newpage{}

### FP to Integer Conversion Simplified Pseudo-code

Key for pseudo-code:

| term                      | result type | definition                                                                                    |
|---------------------------|-------------|-----------------------------------------------------------------------------------------------|
| `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                  |
| `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                         |
| `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                           |
| `int::BITS`               | `int`       | the bit-width of `int`                                                                        |
| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                       |
| `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                         |
| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                         |
| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                      |
| `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.      |
| `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode` |

<div id="fp-to-int-openpower-conversion-semantics"></div>
OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
Power ISA v3.1B):

```
    def fp_to_int_open_power<fp, int>(v: fp) -> int:
        if v is NaN:
            return int::MIN_VALUE
        if v >= int::MAX_VALUE:
            return int::MAX_VALUE
        if v <= int::MIN_VALUE:
            return int::MIN_VALUE
        return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-java-saturating-conversion-semantics"></div>
[Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
(only for long/int results)/
[Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
(with adjustment to add non-truncate rounding modes):

```
    def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
        if v is NaN:
            return 0
        if v >= int::MAX_VALUE:
            return int::MAX_VALUE
        if v <= int::MIN_VALUE:
            return int::MIN_VALUE
        return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-javascript-conversion-semantics"></div>
Section 7.1 of the ECMAScript / JavaScript
[conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
(with adjustment to add non-truncate rounding modes):

```
    def fp_to_int_java_script<fp, int>(v: fp) -> int:
        if v is NaN or infinite:
            return 0
        v = rint(v, rounding_mode)  # assume no loss of precision in result
        v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
        bits = (uint)v
        return (int)bits
```

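For readers who want to experiment, the three simplified definitions
above can be transcribed into runnable Python. This is only a sketch
under stated assumptions: it models the truncating rounding mode alone
(odd `CVM` values), takes the target type as hypothetical `bits` and
`signed` parameters instead of a generic `int` type, and uses the
helper names `int_range` and `_saturate`, which do not appear in the
RFC.

```python
import math


def int_range(bits: int, signed: bool) -> tuple[int, int]:
    """Return (MIN_VALUE, MAX_VALUE) of the target integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def _saturate(v: float, bits: int, signed: bool, nan_result: int) -> int:
    """Shared saturating conversion; only the NaN result differs."""
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return nan_result
    if v >= hi:  # also catches +infinity
        return hi
    if v <= lo:  # also catches -infinity
        return lo
    return math.trunc(v)


def fp_to_int_open_power(v: float, bits: int = 32, signed: bool = True) -> int:
    return _saturate(v, bits, signed, nan_result=int_range(bits, signed)[0])


def fp_to_int_java_saturating(v: float, bits: int = 32, signed: bool = True) -> int:
    return _saturate(v, bits, signed, nan_result=0)


def fp_to_int_java_script(v: float, bits: int = 32, signed: bool = True) -> int:
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = math.trunc(v) % (1 << bits)  # Python's % is non-negative here
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits               # reinterpret the bits as signed
    return wrapped
```
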

----------

\newpage{}

## Floating Convert To Integer In GPR

```
    fcvttg RT, FRB, CVM, IT
    fcvttg. RT, FRB, CVM, IT
    fcvttgo RT, FRB, CVM, IT
    fcvttgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
    # based on xscvdpuxws
    reset_xflags()
    src <- bfp_CONVERT_FROM_BFP64((FRB))

    switch(IT)
        case(0):  # Signed 32-bit
            range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
            range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(1):  # Unsigned 32-bit
            range_min <- bfp_CONVERT_FROM_UI32(0)
            range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(2):  # Signed 64-bit
            range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
            range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF
        default:  # Unsigned 64-bit
            range_min <- bfp_CONVERT_FROM_UI64(0)
            range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF

    if CVM[2] = 1 or FPSCR.RN = 0b01 then
        rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
    else if FPSCR.RN = 0b00 then
        rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
    else if FPSCR.RN = 0b10 then
        rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
    else if FPSCR.RN = 0b11 then
        rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

    switch(CVM)
        case(0, 1):  # OpenPower semantics
            if IsNaN(rnd) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        case(2, 3):  # Java/Saturating semantics
            if IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        default:  # JavaScript semantics
            # CVM = 6, 7 are illegal instructions
            # this works because the largest type we try to convert from has
            # 53 significand bits, and the largest type we try to convert to
            # has 64 bits, and the sum of those is strictly less than the 128
            # bits of the intermediate result.
            limit <- bfp_CONVERT_FROM_UI128([1] * 128)
            if IsInf(rnd) or IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
                result <- [0] * 64
            else
                result128 <- si128_CONVERT_FROM_BFP(rnd)
                result <- result128[64:127] & js_mask

    switch(IT)
        case(0):  # Signed 32-bit
            result <- EXTS64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
        case(1):  # Unsigned 32-bit
            result <- EXTZ64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
        case(2):  # Signed 64-bit
            result_bfp <- bfp_CONVERT_FROM_SI64(result)
        default:  # Unsigned 64-bit
            result_bfp <- bfp_CONVERT_FROM_UI64(result)

    if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
    if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
    if xx_flag = 1 then SetFX(FPSCR.XX)

    vx_flag <- vxsnan_flag | vxcvi_flag
    vex_flag <- FPSCR.VE & vx_flag

    if vex_flag = 0 then
        RT <- result
        FPSCR.FPRF <- undefined
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
        if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
            overflow <- 1  # signals SO only when OE = 1
    else
        FPSCR.FR <- 0
        FPSCR.FI <- 0
```

Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result.  Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

    CR0              (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR   (TODO: which bits?)

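The overflow condition, `IsNaN(src) or not bfp_COMPARE_EQ(src,
result_bfp)`, amounts to asking whether the integer result has exactly
the same numerical value as the source. A minimal Python sketch of that
test follows; `conversion_overflows` is a hypothetical name, and the
sketch relies on Python comparing a `float` with an `int` exactly.

```python
import math


def conversion_overflows(src: float, result: int) -> bool:
    """True when the integer result does not exactly preserve src,
    i.e. src was NaN, was rounded, or was saturated or wrapped."""
    return math.isnan(src) or src != result
```
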
### Assembly Aliases

| Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
|---------------------------|----------------------------|---------------------------|----------------------------|
| `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
| `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
| `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
| `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
| `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
| `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
| `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
| `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |

----------

\newpage{}

## Floating Convert Single To Integer In GPR

```
    fcvtstg RT, FRB, CVM, IT
    fcvtstg. RT, FRB, CVM, IT
    fcvtstgo RT, FRB, CVM, IT
    fcvtstgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
    # based on xscvdpuxws
    reset_xflags()
    src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

    switch(IT)
        case(0):  # Signed 32-bit
            range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
            range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(1):  # Unsigned 32-bit
            range_min <- bfp_CONVERT_FROM_UI32(0)
            range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(2):  # Signed 64-bit
            range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
            range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF
        default:  # Unsigned 64-bit
            range_min <- bfp_CONVERT_FROM_UI64(0)
            range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF

    if CVM[2] = 1 or FPSCR.RN = 0b01 then
        rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
    else if FPSCR.RN = 0b00 then
        rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
    else if FPSCR.RN = 0b10 then
        rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
    else if FPSCR.RN = 0b11 then
        rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

    switch(CVM)
        case(0, 1):  # OpenPower semantics
            if IsNaN(rnd) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        case(2, 3):  # Java/Saturating semantics
            if IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        default:  # JavaScript semantics
            # CVM = 6, 7 are illegal instructions
            # this works because the largest type we try to convert from has
            # 53 significand bits, and the largest type we try to convert to
            # has 64 bits, and the sum of those is strictly less than the 128
            # bits of the intermediate result.
            limit <- bfp_CONVERT_FROM_UI128([1] * 128)
            if IsInf(rnd) or IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
                result <- [0] * 64
            else
                result128 <- si128_CONVERT_FROM_BFP(rnd)
                result <- result128[64:127] & js_mask

    switch(IT)
        case(0):  # Signed 32-bit
            result <- EXTS64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
        case(1):  # Unsigned 32-bit
            result <- EXTZ64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
        case(2):  # Signed 64-bit
            result_bfp <- bfp_CONVERT_FROM_SI64(result)
        default:  # Unsigned 64-bit
            result_bfp <- bfp_CONVERT_FROM_UI64(result)

    if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
    if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
    if xx_flag = 1 then SetFX(FPSCR.XX)

    vx_flag <- vxsnan_flag | vxcvi_flag
    vex_flag <- FPSCR.VE & vx_flag

    if vex_flag = 0 then
        RT <- result
        FPSCR.FPRF <- undefined
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
        if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
            overflow <- 1  # signals SO only when OE = 1
    else
        FPSCR.FR <- 0
        FPSCR.FI <- 0
```

Convert from a 32-bit float in FRB (held in the usual 32-bit float in
64-bit float format) to an unsigned/signed 32/64-bit integer in RT, with
the conversion overflow/rounding semantics following the chosen `CVM`
value. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result.  Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

    CR0              (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR   (TODO: which bits?)

### Assembly Aliases

| Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
|----------------------------|-----------------------------|----------------------------|-----------------------------|
| `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
| `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
| `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
| `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
| `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
| `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
| `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
| `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |

----------

\newpage{}

----------

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic | Description |
|----|------|------|---------|----------|-------------|
|VA  | I    | #    | 3.2B    |todo   | |

----------------

[[!tag opf_rfc]]