openpower/sv/rfc/ls006.mdwn

   1 # RFC ls006 FPR <-> GPR Move/Conversion
   2
   3 **URLs**:
   4
   5 * <https://libre-soc.org/openpower/sv/int_fp_mv/>
   6 * <https://libre-soc.org/openpower/sv/rfc/ls006/>
   7 * <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
   8 * <https://git.openpower.foundation/isa/PowerISA/issues/todo>
   9
  10 **Severity**: Major
  11
  12 **Status**: New
  13
  14 **Date**: 20 Oct 2022
  15
  16 **Target**: v3.2B
  17
  18 **Source**: v3.1B
  19
  20 **Books and Section affected**: **UPDATE**
  21
  22 * Book I 4.6.5 Floating-Point Move Instructions
  23 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
  24 * Appendix E Power ISA sorted by opcode
  25 * Appendix F Power ISA sorted by version
  26 * Appendix G Power ISA sorted by Compliancy Subset
  27 * Appendix H Power ISA sorted by mnemonic
  28
  29 **Summary**
  30
  31 Instructions added
  32
  33 * `fmvtg` -- Floating Move to GPR
  34 * `fmvfg` -- Floating Move from GPR
  35 * `fcvttg`/`fcvttgo` -- Floating Convert to Integer in GPR
  36 * `fcvtfg` -- Floating Convert from Integer in GPR
  37
  38 **Submitter**: Luke Leighton (Libre-SOC)
  39
  40 **Requester**: Libre-SOC
  41
  42 **Impact on processor**:
  43
  44 * Addition of five new GPR-FPR-based instructions
  45
  46 **Impact on software**:
  47
  48 * Requires support for new instructions in assembler, debuggers,
  49   and related tools.
  50
  51 **Keywords**:
  52
  53 ```
  54     GPR, FPR, Move, Conversion, JavaScript
  55 ```
  56
  57 **Motivation**
  58
  59 CPUs without VSX/VMX lack a way to efficiently transfer data between
  60 FPRs and GPRs: the data has to go through memory. This proposal adds
  61 more efficient instructions (both bitwise copy and Integer <-> FP
  62 conversion) that transfer data directly between FPRs and GPRs, with no
  63 need to go through memory.
  64
  65 IEEE 754 doesn't specify what results are obtained when converting a NaN
  66 or out-of-range floating-point value to integer, so different programming
  67 languages and ISAs have made different choices.  Below is an overview
  68 of the different variants, listing the languages and hardware that
  69 implement each variant.
  70
  71 **Notes and Observations**:
  72
  73 * These instructions are present in many other ISAs.
  74 * JavaScript rounding as one instruction saves 35 instructions including
  75   six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
  76
  77 **Changes**
  78
  79 Add the following entries to:
  80
  81 * Book I 4.6.5 Floating-Point Move Instructions
  82 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
  83 * Book I 1.6.1 and 1.6.2
  84
  85 ----------------
  86
  87 \newpage{}
  88
  89 # Immediate Tables
  90
  91 Tables that are used by
  92 `fmvtg[s][.]`/`fmvfg[s][.]`/`fcvttg[s][.]`/`fcvtfg[s][.]`:
  93
  94 ## `RCS` -- `Rc` and `s`
  95
  96 | `RCS` | `Rc` | FP Single Mode | Assembly Alias Mnemonic |
  97 |-------|------|----------------|-------------------------|
  98 | 0     | 0    | Double         | `<op>`                  |
  99 | 1     | 1    | Double         | `<op>.`                 |
 100 | 2     | 0    | Single         | `<op>s`                 |
 101 | 3     | 1    | Single         | `<op>s.`                |
 102
 103 ## `IT` -- Integer Type
 104
 105 | `IT` | Integer Type    | Assembly Alias Mnemonic |
 106 |------|-----------------|-------------------------|
 107 | 0    | Signed 32-bit   | `<op>w`                 |
 108 | 1    | Unsigned 32-bit | `<op>uw`                |
 109 | 2    | Signed 64-bit   | `<op>d`                 |
 110 | 3    | Unsigned 64-bit | `<op>ud`                |
 111
 112 ## `CVM` -- Float to Integer Conversion Mode
 113
 114 | `CVM` | `rounding_mode` | Semantics                        |
 115 |-------|-----------------|----------------------------------|
 116 | 000   | from `FPSCR`    | [OpenPower semantics]            |
 117 | 001   | Truncate        | [OpenPower semantics]            |
 118 | 010   | from `FPSCR`    | [Java/Saturating semantics]      |
 119 | 011   | Truncate        | [Java/Saturating semantics]      |
 120 | 100   | from `FPSCR`    | [JavaScript semantics]           |
 121 | 101   | Truncate        | [JavaScript semantics]           |
 122 | rest  | --              | illegal instruction trap for now |
 123
 124 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
 125 [Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
 126 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
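
The `CVM` table can be read as two independent sub-fields: the low bit selects truncation, and the upper two bits select the semantics. A minimal Python sketch of that reading follows; the function name `decode_cvm` and the returned strings are illustrative assumptions, not part of this RFC.

```python
# Sketch only: mirrors the CVM table above. Names are hypothetical.
SEMANTICS = {0b00: "OpenPower", 0b01: "Java/Saturating", 0b10: "JavaScript"}


def decode_cvm(cvm):
    """Return (rounding_mode, semantics) for a 3-bit CVM value."""
    if not 0 <= cvm <= 0b111:
        raise ValueError("CVM is a 3-bit field")
    semantics = SEMANTICS.get(cvm >> 1)
    if semantics is None:
        # 0b110 and 0b111: illegal instruction trap for now
        raise ValueError("reserved CVM value")
    rounding = "Truncate" if cvm & 1 else "from FPSCR"
    return rounding, semantics
```

For example, `decode_cvm(0b101)` returns `("Truncate", "JavaScript")`.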
 127
 128 ----------
 129
 130 \newpage{}
 131
 132 ## FPR to GPR Move
 133
 134 `fmvtg RT, FRB`
 135 `fmvtg. RT, FRB`
 136
 137 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 138 |-----|------|-------|-------|-------|----|--------|
 139 | PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |
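
As an illustration of the field layout above, the following Python sketch packs the X-Form fields into a 32-bit word, remembering that Power ISA numbers bit 0 as the most significant bit. The function name is an assumption, and since this RFC does not yet assign `PO` or `XO` values, any values passed in are placeholders.

```python
def encode_x_form(po, rt, frb, xo, rc):
    """Pack X-Form fields into a 32-bit word (bit 0 is the MSB)."""
    fields = (("PO", po, 6), ("RT", rt, 5), ("FRB", frb, 5),
              ("XO", xo, 10), ("Rc", rc, 1))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Bits 11-15 are left as zero, as in the table above.
    return (po << 26) | (rt << 21) | (frb << 11) | (xo << 1) | rc
```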
 140
 141 ```
 142     RT <- (FRB)
 143 ```
 144
 145 Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
 146 representation directly. This is equivalent to `stfd` followed by `ld`.
 147 As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.
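
The bit-copy behaviour can be modelled in Python with the standard `struct` module. This is only a sketch of the data movement: the function reuses the mnemonic as its name for readability, and it does not model `CR0`.

```python
import struct


def fmvtg(frb_value):
    """Return the 64-bit IEEE 754 bit pattern of a float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", frb_value))[0]
```

For example, `hex(fmvtg(1.0))` gives `'0x3ff0000000000000'`.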
 148
 149 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 150 operations.
 151
 152 Special Registers altered:
 153
 154     CR0     (if Rc=1)
 155
 156 ----------
 157
 158 \newpage{}
 159
 160 ## FPR to GPR Move Single
 161
 162 `fmvtgs RT, FRB`
 163 `fmvtgs. RT, FRB`
 164
 165 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 166 |-----|------|-------|-------|-------|----|--------|
 167 | PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |
 168
 169 ```
 170     RT <- [0] * 32 || SINGLE((FRB))  # SINGLE since that's what stfs uses
 171 ```
 172
 173 Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
 174 representation directly. This is equivalent to `stfs` followed by `lwz`.
 175 As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.
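
A rough Python model of the single-precision move is shown below. It assumes the FPR holds a value that is exactly representable in single precision (the usual case for single-precision data held in an FPR); the function name simply reuses the mnemonic.

```python
import struct


def fmvtgs(frb_value):
    """Single-precision bit pattern of frb_value, zero-extended to 64 bits."""
    # Assumption: frb_value is exactly representable as a 32-bit float.
    return struct.unpack("<I", struct.pack("<f", frb_value))[0]
```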
 176
 177 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 178 operations.
 179
 180 Special Registers altered:
 181
 182     CR0     (if Rc=1)
 183
 184 ----------
 185
 186 \newpage{}
 187
 188 ## GPR to FPR Move
 189
 190 `fmvfg FRT, RB`
 191 `fmvfg. FRT, RB`
 192
 193 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 194 |-----|------|-------|-------|-------|----|--------|
 195 | PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |
 196
 197 ```
 198     FRT <- (RB)
 199 ```
 200
 201 Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
 202 representation directly. This is equivalent to `std` followed by `lfd`.
 203 As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.
 204
 205 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 206 operations.
 207
 208 Special Registers altered:
 209
 210     CR1     (if Rc=1)
 211
 212 ----------
 213
 214 \newpage{}
 215
 216 ## GPR to FPR Move Single
 217
 218 `fmvfgs FRT, RB`
 219 `fmvfgs. FRT, RB`
 220
 221 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 222 |-----|------|-------|-------|-------|----|--------|
 223 | PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |
 224
 225 ```
 226     FRT <- DOUBLE((RB)[32:63])  # DOUBLE since that's what lfs uses
 227 ```
 228
 229 Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
 230 representation directly. This is equivalent to `stw` followed by `lfs`.
 231 As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.
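
The following Python sketch models the data movement only: the low 32 bits of the GPR are read as a single-precision pattern and widened to double, as `lfs` would do. The function name reuses the mnemonic, and `CR1` is not modelled.

```python
import struct


def fmvfgs(rb):
    """Interpret the low 32 bits of rb as a single and widen to double."""
    low32 = rb & 0xFFFF_FFFF  # (RB)[32:63] in MSB0 bit numbering
    return struct.unpack("<f", struct.pack("<I", low32))[0]
```

The upper 32 bits of the GPR are ignored, so `fmvfgs(0xFFFF_FFFF_3F80_0000)` gives `1.0`.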
 232
 233 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 234 operations.
 235
 236 Special Registers altered:
 237
 238     CR1     (if Rc=1)
 239
 240 ----------
 241
 242 \newpage{}
 243
 244 ## Floating-point Convert From GPR
 245
 246 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
 247 |-----|------|-------|-------|-------|-------|----|--------|
 248 | PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |
 249
 250 `fcvtfg FRT, RB, IT`
 251 `fcvtfg. FRT, RB, IT`
 254
 255 ```
 256     if IT[0] = 0 then  # 32-bit int -> 64-bit float
 257         # rounding never necessary, so don't touch FPSCR
 258         # based off xvcvsxwdp
 259         if IT = 0 then  # Signed 32-bit
 260             src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 261         else  # IT = 1 -- Unsigned 32-bit
 262             src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 263         FRT <- bfp64_CONVERT_FROM_BFP(src)
 264     else
 265         # rounding may be necessary. based off xscvuxdsp
 266         reset_xflags()
 267         switch(IT)
 268             case(0):  # Signed 32-bit
 269                 src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 270             case(1):  # Unsigned 32-bit
 271                 src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 272             case(2):  # Signed 64-bit
 273                 src <- bfp_CONVERT_FROM_SI64((RB))
 274             default:  # Unsigned 64-bit
 275                 src <- bfp_CONVERT_FROM_UI64((RB))
 276         rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
 277         result <- bfp64_CONVERT_FROM_BFP(rnd)
 278         cls <- fprf_CLASS_BFP64(result)
 279
 280         if xx_flag = 1 then SetFX(FPSCR.XX)
 281
 282         FRT <- result
 283         FPSCR.FPRF <- cls
 284         FPSCR.FR <- inc_flag
 285         FPSCR.FI <- xx_flag
 286 ```
 287 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
 288 don't remove them -->
 289
 290 Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
 291 float in FRT.
 292
 293 If converting from an unsigned/signed 32-bit integer to a 64-bit float,
 294 rounding is never necessary, so `FPSCR` is unmodified and exceptions are
 295 never raised. Otherwise, `FPSCR` is modified and exceptions are raised
 296 as usual.
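
The claim that 32-bit integers never need rounding follows from the 53-bit significand of a 64-bit float. A small Python check illustrates this (Python's `float` is an IEEE 754 64-bit float on common platforms; the helper name is an assumption):

```python
def converts_exactly(n):
    """True if integer n survives a round trip through a 64-bit float."""
    return int(float(n)) == n


# Extremes of the signed and unsigned 32-bit ranges are all exact.
exact_32 = all(converts_exactly(n) for n in (-2**31, 2**31 - 1, 2**32 - 1))

# A 64-bit integer may need rounding: 2**53 + 1 is not representable.
exact_64 = converts_exactly(2**53 + 1)
```

Here `exact_32` is `True` and `exact_64` is `False`, which is why only the 64-bit cases touch `FPSCR`.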
 297
 298 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 299 operations.
 300
 301 Special Registers altered:
 302
 303     CR1     (if Rc=1)
 304     FPSCR   (TODO: which bits?) (if IT[0]=1)
 305
 306 ### Assembly Aliases
 307
 308 | Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
 309 |----------------------|----------------------|------|----------------------|----------------------|
 310 | `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
 311 | `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
 312 | `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
 313 | `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |
 314
 315 ----------
 316
 317 \newpage{}
 318
 319 ## Floating-point Convert From GPR Single
 320
 321 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
 322 |-----|------|-------|-------|-------|-------|----|--------|
 323 | PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |
 324
 325 `fcvtfgs FRT, RB, IT`
 326 `fcvtfgs. FRT, RB, IT`
 327
 328 ```
 329     # rounding may be necessary. based off xscvuxdsp
 330     reset_xflags()
 331     switch(IT)
 332         case(0):  # Signed 32-bit
 333             src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 334         case(1):  # Unsigned 32-bit
 335             src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 336         case(2):  # Signed 64-bit
 337             src <- bfp_CONVERT_FROM_SI64((RB))
 338         default:  # Unsigned 64-bit
 339             src <- bfp_CONVERT_FROM_UI64((RB))
 340     rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
 341     result32 <- bfp32_CONVERT_FROM_BFP(rnd)
 342     cls <- fprf_CLASS_BFP32(result32)
 343     result <- DOUBLE(result32)
 344
 345     if xx_flag = 1 then SetFX(FPSCR.XX)
 346
 347     FRT <- result
 348     FPSCR.FPRF <- cls
 349     FPSCR.FR <- inc_flag
 350     FPSCR.FI <- xx_flag
 351 ```
 352 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
 353 don't remove them -->
 354
 355 Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
 356 float in FRT, following the usual 32-bit float in 64-bit float format.
 357 `FPSCR` is modified and exceptions are raised as usual.
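
Unlike the double-precision case, even 32-bit integers can need rounding here, because a 32-bit float has only a 24-bit significand. The Python sketch below shows the effect under round-to-nearest-even only (the instruction itself uses `FPSCR.RN`). The helper name is an assumption, and it is only valid for integers that are exactly representable as a 64-bit float, so that no double rounding occurs.

```python
import struct


def to_single(n):
    """Round integer n to single precision; result returned as a float.

    Assumes n is exactly representable as a 64-bit float.
    """
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]
```

For example, `to_single(16777217)` (that is, 2^24 + 1) gives `16777216.0`.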
 358
 359 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 360 operations.
 361
 362 Special Registers altered:
 363
 364     CR1     (if Rc=1)
 365     FPSCR   (TODO: which bits?)
 366
 367 ### Assembly Aliases
 368
 369 | Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
 370 |----------------------|-----------------------|------|----------------------|-----------------------|
 371 | `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
 372 | `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
 373 | `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
 374 | `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |
 375
 376 ----------
 377
 378 \newpage{}
 379
 380 ## Floating-point to Integer Conversion Overview
 381
 382 <div id="fpr-to-gpr-conversion-mode"></div>
 383
 384 IEEE 754 doesn't specify what results are obtained when converting a NaN
 385 or out-of-range floating-point value to integer, so different programming
 386 languages and ISAs have made different choices.  Below is an overview
 387 of the different variants, listing the languages and hardware that
 388 implement each variant.
 389
 390 For convenience, we will give those different conversion semantics names
 391 based on which common ISA or programming language uses them, since there
 392 may not be an established name for them:
 393
 394 **Standard OpenPower conversion**
 395
 396 This conversion performs "saturation with NaN converted to minimum
 397 valid integer". This is also exactly the same as the x86 ISA conversion
 398 semantics.  OpenPOWER however has instructions for both:
 399
 400 * rounding mode read from FPSCR
 401 * rounding mode always set to truncate
 402
 403 **Java/Saturating conversion**
 404
 405 For the sake of simplicity, the FP -> Integer conversion semantics
 406 generalized from those used by Java (and by Rust's `as` operator)
 407 will be referred to as [Java/Saturating conversion
 408 semantics](#fp-to-int-java-saturating-conversion-semantics).
 409
 410 Those same semantics are used in some way by all of the following
 411 languages (not necessarily for the default conversion method):
 412
 413 * Java's
 414   [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 415   (only for long/int results)
 416 * Rust's FP -> Integer conversion using the
 417   [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 418 * LLVM's
 419   [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
 420   [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
 421 * SPIR-V's OpenCL dialect's
 422   [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
 423   [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
 424   instructions when decorated with
 425   [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
 426 * WebAssembly has also introduced
 427   [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
 428   [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
 429
 430 **JavaScript conversion**
 431
 432 For the sake of simplicity, the FP -> Integer conversion
 433 semantics generalized from those used by JavaScript's `ToInt32`
 434 abstract operation will be referred to as [JavaScript conversion
 435 semantics](#fp-to-int-javascript-conversion-semantics).
 436
 437 An instruction with these semantics is present in ARM as `FJCVTZS`:
 438 <https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
 439
 440 **Rc=1 and OE=1**
 441
 442 All of these instructions have an Rc=1 mode which sets CR0
 443 in the normal way for any instructions producing a GPR result.
 444 Additionally, when OE=1, if the numerical value of the FP number
 445 is not 100% accurately preserved (due to truncation or saturation
 446 and including when the FP number was NaN) then this is considered
 447 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
 448 are all set as normal for any GPR instructions that overflow.
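
The overflow condition described above amounts to checking whether the integer result is numerically equal to the original floating-point input. A minimal Python sketch follows; the function name is an assumption. Python compares `float` and `int` values exactly, so no extra rounding is introduced by the comparison.

```python
import math


def conversion_overflows(src, result):
    """True if integer `result` does not exactly equal the FP input `src`."""
    return math.isnan(src) or src != result
```

For example, `conversion_overflows(3.5, 3)` is `True` (truncation lost the fraction), while `conversion_overflows(3.0, 3)` is `False`.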
 449
 450 \newpage{}
 451
 452 ### FP to Integer Conversion Simplified Pseudo-code
 453
 454 Key for pseudo-code:
 455
 456 | term                      | result type | definition                                                                                         |
 457 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
 458 | `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                       |
 459 | `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                              |
 460 | `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                                |
 461 | `int::BITS`               | `int`       | the bit-width of `int`                                                                             |
 462 | `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                            |
 463 | `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                              |
 464 | `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                              |
 465 | `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                           |
 466 | `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.           |
 467 | `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode`      |
 468
 469 <div id="fp-to-int-openpower-conversion-semantics"></div>
 470 OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
 471 Power ISA v3.1B):
 472
 473 ```
 474     def fp_to_int_open_power<fp, int>(v: fp) -> int:
 475         if v is NaN:
 476             return int::MIN_VALUE
 477         if v >= int::MAX_VALUE:
 478             return int::MAX_VALUE
 479         if v <= int::MIN_VALUE:
 480             return int::MIN_VALUE
 481         return (int)rint(v, rounding_mode)
 482 ```
 483
 484 <div id="fp-to-int-java-saturating-conversion-semantics"></div>
 485 [Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 486 (only for long/int results)/
 487 [Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 488 (with adjustment to add non-truncate rounding modes):
 489
 490 ```
 491     def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
 492         if v is NaN:
 493             return 0
 494         if v >= int::MAX_VALUE:
 495             return int::MAX_VALUE
 496         if v <= int::MIN_VALUE:
 497             return int::MIN_VALUE
 498         return (int)rint(v, rounding_mode)
 499 ```
 500
 501 <div id="fp-to-int-javascript-conversion-semantics"></div>
 502 Section 7.1 of the ECMAScript / JavaScript
 503 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
 504 (with adjustment to add non-truncate rounding modes):
 505
 506 ```
 507     def fp_to_int_java_script<fp, int>(v: fp) -> int:
 508         if v is NaN or infinite:
 509             return 0
 510         v = rint(v, rounding_mode)  # assume no loss of precision in result
 511         v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
 512         bits = (uint)v
 513         return (int)bits
 514 ```
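
The three simplified pseudo-code functions above can be turned into runnable Python for experimentation. This is a sketch under stated assumptions: only the truncating rounding mode is modelled (via `math.trunc`), and the `bits` and `signed` parameters are hypothetical stand-ins for the generic `int` type parameter.

```python
import math


def int_range(bits, signed):
    """Return (MIN_VALUE, MAX_VALUE) for the chosen integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fp_to_int_open_power(v, bits=32, signed=True):
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return lo  # NaN becomes the minimum valid integer
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return math.trunc(v)


def fp_to_int_java_saturating(v, bits=32, signed=True):
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return 0  # the only difference from the OpenPower variant
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return math.trunc(v)


def fp_to_int_java_script(v, bits=32, signed=True):
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = math.trunc(v) % (1 << bits)  # non-negative, modulo 2^bits
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits  # reinterpret the bits as two's complement
    return wrapped
```

For an out-of-range input such as `1e10`, the first two functions saturate to `2147483647`, whereas `fp_to_int_java_script(2.0**32 + 5.0)` wraps around to `5`.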
 515
 516
 517 ----------
 518
 519 \newpage{}
 520
 521
 522 ## Floating-point Convert To GPR
 523
 524 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
 525 |-----|------|-------|-------|-------|-------|----|----|---------|
 526 | PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |
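
To make the XO-Form layout above concrete, the following Python sketch extracts each field from a 32-bit instruction word, converting the MSB0 bit positions of the table into right shifts. The function name is an assumption, and no real `PO` or `XO` values are implied, since this RFC does not yet assign them.

```python
def decode_xo_form(word):
    """Split a 32-bit word into the XO-Form fields used by fcvttg."""
    return {
        "PO": (word >> 26) & 0x3F,   # bits 0-5
        "RT": (word >> 21) & 0x1F,   # bits 6-10
        "IT": (word >> 19) & 0x3,    # bits 11-12
        "CVM": (word >> 16) & 0x7,   # bits 13-15
        "FRB": (word >> 11) & 0x1F,  # bits 16-20
        "XO": (word >> 2) & 0x1FF,   # bits 21-29
        "OE": (word >> 1) & 0x1,     # bit 30
        "Rc": word & 0x1,            # bit 31
    }
```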
 527
 528 `fcvttg RT, FRB, CVM, IT`
 529 `fcvttg. RT, FRB, CVM, IT`
 530 `fcvttgo RT, FRB, CVM, IT`
 531 `fcvttgo. RT, FRB, CVM, IT`
 532
 533 ```
 534     # based on xscvdpuxws
 535     reset_xflags()
 536     src <- bfp_CONVERT_FROM_BFP64((FRB))
 537
 538     switch(IT)
 539         case(0):  # Signed 32-bit
 540             range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
 541             range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
 542             js_mask <- 0xFFFF_FFFF
 543         case(1):  # Unsigned 32-bit
 544             range_min <- bfp_CONVERT_FROM_UI32(0)
 545             range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
 546             js_mask <- 0xFFFF_FFFF
 547         case(2):  # Signed 64-bit
 548             range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
 549             range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
 550             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 551         default:  # Unsigned 64-bit
 552             range_min <- bfp_CONVERT_FROM_UI64(0)
 553             range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
 554             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 555
 556     if CVM[2] = 1 or FPSCR.RN = 0b01 then
 557         rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
 558     else if FPSCR.RN = 0b00 then
 559         rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
 560     else if FPSCR.RN = 0b10 then
 561         rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
 562     else if FPSCR.RN = 0b11 then
 563         rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)
 564
 565     switch(CVM)
 566         case(0, 1):  # OpenPower semantics
 567             if IsNaN(rnd) then
 568                 result <- si64_CONVERT_FROM_BFP(range_min)
 569             else if bfp_COMPARE_GT(rnd, range_max) then
 570                 result <- ui64_CONVERT_FROM_BFP(range_max)
 571             else if bfp_COMPARE_LT(rnd, range_min) then
 572                 result <- si64_CONVERT_FROM_BFP(range_min)
 573             else if IT[1] = 1 then  # Unsigned 32/64-bit
 574                 result <- ui64_CONVERT_FROM_BFP(rnd)
 575             else  # Signed 32/64-bit
 576                 result <- si64_CONVERT_FROM_BFP(rnd)
 577         case(2, 3):  # Java/Saturating semantics
 578             if IsNaN(rnd) then
 579                 result <- [0] * 64
 580             else if bfp_COMPARE_GT(rnd, range_max) then
 581                 result <- ui64_CONVERT_FROM_BFP(range_max)
 582             else if bfp_COMPARE_LT(rnd, range_min) then
 583                 result <- si64_CONVERT_FROM_BFP(range_min)
 584             else if IT[1] = 1 then  # Unsigned 32/64-bit
 585                 result <- ui64_CONVERT_FROM_BFP(rnd)
 586             else  # Signed 32/64-bit
 587                 result <- si64_CONVERT_FROM_BFP(rnd)
 588         default:  # JavaScript semantics
 589             # CVM = 6, 7 are illegal instructions
 590             # this works because the largest type we try to convert from has
 591             # 53 significand bits, and the largest type we try to convert to
 592             # has 64 bits, and the sum of those is strictly less than the 128
 593             # bits of the intermediate result.
 594             limit <- bfp_CONVERT_FROM_UI128([1] * 128)
 595             if IsInf(rnd) or IsNaN(rnd) then
 596                 result <- [0] * 64
 597             else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
 598                 result <- [0] * 64
 599             else
 600                 result128 <- si128_CONVERT_FROM_BFP(rnd)
 601                 result <- result128[64:127] & js_mask
 602
 603     switch(IT)
 604         case(0):  # Signed 32-bit
 605             result <- EXTS64(result[32:63])
 606             result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
 607         case(1):  # Unsigned 32-bit
 608             result <- EXTZ64(result[32:63])
 609             result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
 610         case(2):  # Signed 64-bit
 611             result_bfp <- bfp_CONVERT_FROM_SI64(result)
 612         default:  # Unsigned 64-bit
 613             result_bfp <- bfp_CONVERT_FROM_UI64(result)
 614
 615     if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
 616     if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
 617     if xx_flag = 1 then SetFX(FPSCR.XX)
 618
 619     vx_flag <- vxsnan_flag | vxcvi_flag
 620     vex_flag <- FPSCR.VE & vx_flag
 621
 622     if vex_flag = 0 then
 623         RT <- result
 624         FPSCR.FPRF <- undefined
 625         FPSCR.FR <- inc_flag
 626         FPSCR.FI <- xx_flag
 627         if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
 628             overflow <- 1  # signals SO only when OE = 1
 629     else
 630         FPSCR.FR <- 0
 631         FPSCR.FI <- 0
 632 ```
 633
 634 Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
 635 in RT, with the conversion overflow/rounding semantics following the
 636 chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.
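
The rounding step of the pseudo-code (selection between truncate and the `FPSCR.RN` modes) can be sketched in Python as below. The function name and the plain-integer `cvm` and `rn` parameters are assumptions for illustration; the input is assumed to be finite, and Python's built-in `round` is used because it rounds ties to even.

```python
import math


def round_to_integer(src, cvm, rn):
    """Round finite float src to an integer per CVM and FPSCR.RN."""
    if cvm & 1 or rn == 0b01:  # CVM[2] is the least significant bit
        return math.trunc(src)
    if rn == 0b00:
        return round(src)      # round to nearest, ties to even
    if rn == 0b10:
        return math.ceil(src)  # round toward +infinity
    return math.floor(src)     # rn == 0b11: round toward -infinity
```

With `cvm=0` and `rn=0b00`, `round_to_integer(2.5, 0, 0b00)` gives `2`, while `round_to_integer(3.5, 0, 0b00)` gives `4`.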
 637
 638 These instructions have an Rc=1 mode which sets CR0 in the normal
 639 way for any instructions producing a GPR result.  Additionally, when OE=1,
 640 if the numerical value of the FP number is not 100% accurately preserved
 641 (due to truncation or saturation and including when the FP number was
 642 NaN) then this is considered to be an Integer Overflow condition, and
 643 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
 644 that overflow.
 645
 646 Special Registers altered:
 647
 648     CR0              (if Rc=1)
 649     XER SO, OV, OV32 (if OE=1)
 650     FPSCR   (TODO: which bits?)
 651
 652 ### Assembly Aliases
 653
 654 | Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
 655 |---------------------------|----------------------------|---------------------------|----------------------------|
 656 | `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
 657 | `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
 658 | `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
 659 | `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
 660 | `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
 661 | `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
 662 | `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
 663 | `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
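
The alias table follows a regular pattern: an integer-type suffix (`w`, `uw`, `d`, `ud`), then an optional `o`, then an optional `.`. A small Python sketch of the expansion that an assembler might perform is given below; the function name and the string-based operand handling are assumptions, not a description of any existing assembler.

```python
import re

IT_SUFFIX = {"w": 0, "uw": 1, "d": 2, "ud": 3}
ALIAS_RE = re.compile(r"^fcvttg(uw|ud|w|d)(o?)(\.?)$")


def expand_alias(alias, rt, frb, cvm):
    """Expand an fcvttg alias mnemonic into the full instruction text."""
    m = ALIAS_RE.match(alias)
    if m is None:
        raise ValueError(f"not an fcvttg alias: {alias}")
    it_suffix, oe, rc = m.groups()
    return f"fcvttg{oe}{rc} {rt}, {frb}, {cvm}, {IT_SUFFIX[it_suffix]}"
```

For example, `expand_alias("fcvttguwo.", "RT", "FRB", "CVM")` gives `"fcvttgo. RT, FRB, CVM, 1"`, matching the last row of the table.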
 664
 665 ----------
 666
 667 \newpage{}
 668
 669 ## Floating-point Convert To GPR Single
 670
 671 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
 672 |-----|------|-------|-------|-------|-------|----|----|---------|
 673 | PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |
 674
 675 `fcvtstg RT, FRB, CVM, IT`
 676 `fcvtstg. RT, FRB, CVM, IT`
 677 `fcvtstgo RT, FRB, CVM, IT`
 678 `fcvtstgo. RT, FRB, CVM, IT`
 679
 680 ```
 681     # based on xscvdpuxws
 682     reset_xflags()
 683     src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))
 684
 685     switch(IT)
 686         case(0):  # Signed 32-bit
 687             range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
 688             range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
 689             js_mask <- 0xFFFF_FFFF
 690         case(1):  # Unsigned 32-bit
 691             range_min <- bfp_CONVERT_FROM_UI32(0)
 692             range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
 693             js_mask <- 0xFFFF_FFFF
 694         case(2):  # Signed 64-bit
 695             range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
 696             range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
 697             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 698         default:  # Unsigned 64-bit
 699             range_min <- bfp_CONVERT_FROM_UI64(0)
 700             range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
 701             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 702
 703     if CVM[2] = 1 or FPSCR.RN = 0b01 then
 704         rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
 705     else if FPSCR.RN = 0b00 then
 706         rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
 707     else if FPSCR.RN = 0b10 then
 708         rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
 709     else if FPSCR.RN = 0b11 then
 710         rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)
 711
 712     switch(CVM)
 713         case(0, 1):  # OpenPower semantics
 714             if IsNaN(rnd) then
 715                 result <- si64_CONVERT_FROM_BFP(range_min)
 716             else if bfp_COMPARE_GT(rnd, range_max) then
 717                 result <- ui64_CONVERT_FROM_BFP(range_max)
 718             else if bfp_COMPARE_LT(rnd, range_min) then
 719                 result <- si64_CONVERT_FROM_BFP(range_min)
 720             else if IT[1] = 1 then  # Unsigned 32/64-bit
 721                 result <- ui64_CONVERT_FROM_BFP(rnd)
 722             else  # Signed 32/64-bit
 723                 result <- si64_CONVERT_FROM_BFP(rnd)
 724         case(2, 3):  # Java/Saturating semantics
 725             if IsNaN(rnd) then
 726                 result <- [0] * 64
 727             else if bfp_COMPARE_GT(rnd, range_max) then
 728                 result <- ui64_CONVERT_FROM_BFP(range_max)
 729             else if bfp_COMPARE_LT(rnd, range_min) then
 730                 result <- si64_CONVERT_FROM_BFP(range_min)
 731             else if IT[1] = 1 then  # Unsigned 32/64-bit
 732                 result <- ui64_CONVERT_FROM_BFP(rnd)
 733             else  # Signed 32/64-bit
 734                 result <- si64_CONVERT_FROM_BFP(rnd)
 735         default:  # JavaScript semantics
 736             # CVM = 6, 7 are illegal instructions
 737             # this works because the largest type we try to convert from has
 738             # 53 significand bits, and the largest type we try to convert to
 739             # has 64 bits, and the sum of those is strictly less than the 128
 740             # bits of the intermediate result.
 741             limit <- bfp_CONVERT_FROM_UI128([1] * 128)
 742             if IsInf(rnd) or IsNaN(rnd) then
 743                 result <- [0] * 64
 744             else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
 745                 result <- [0] * 64
 746             else
 747                 result128 <- si128_CONVERT_FROM_BFP(rnd)
 748                 result <- result128[64:127] & js_mask
 749
 750     switch(IT)
 751         case(0):  # Signed 32-bit
 752             result <- EXTS64(result[32:63])
 753             result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
 754         case(1):  # Unsigned 32-bit
 755             result <- EXTZ64(result[32:63])
 756             result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
 757         case(2):  # Signed 64-bit
 758             result_bfp <- bfp_CONVERT_FROM_SI64(result)
 759         default:  # Unsigned 64-bit
 760             result_bfp <- bfp_CONVERT_FROM_UI64(result)
 761
 762     if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
 763     if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
 764     if xx_flag = 1 then SetFX(FPSCR.XX)
 765
 766     vx_flag <- vxsnan_flag | vxcvi_flag
 767     vex_flag <- FPSCR.VE & vx_flag
 768
 769     if vex_flag = 0 then
 770         RT <- result
 771         FPSCR.FPRF <- undefined
 772         FPSCR.FR <- inc_flag
 773         FPSCR.FI <- xx_flag
 774         if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
 775             overflow <- 1  # signals SO only when OE = 1
 776     else
 777         FPSCR.FR <- 0
 778         FPSCR.FI <- 0
 779 ```
 780
 781 Convert from a 32-bit float in FRB to an unsigned/signed 32/64-bit integer
 782 in RT, with the conversion overflow/rounding semantics following the
 783 chosen `CVM` value, following the usual 32-bit float in 64-bit float
 784 format. `FPSCR` is modified and exceptions are raised as usual.
 785
 786 These instructions have an Rc=1 mode which sets CR0 in the normal
 787 way for any instructions producing a GPR result.  Additionally, when OE=1,
 788 if the numerical value of the FP number is not 100% accurately preserved
 789 (due to truncation or saturation and including when the FP number was
 790 NaN) then this is considered to be an Integer Overflow condition, and
 791 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
 792 that overflow.
 793
 794 Special Registers altered:
 795
 796     CR0              (if Rc=1)
 797     XER SO, OV, OV32 (if OE=1)
 798     FPSCR   (TODO: which bits?)
 799
 800 ### Assembly Aliases
 801
 802 | Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
 803 |----------------------------|-----------------------------|----------------------------|-----------------------------|
 804 | `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
 805 | `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
 806 | `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
 807 | `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
 808 | `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
 809 | `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
 810 | `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
 811 | `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
 812
 813 ----------
 814
 815 \newpage{}
 816
 817 ----------
 818
 819 # Appendices
 820
 821     Appendix E Power ISA sorted by opcode
 822     Appendix F Power ISA sorted by version
 823     Appendix G Power ISA sorted by Compliancy Subset
 824     Appendix H Power ISA sorted by mnemonic
 825
 826 |Form| Book | Page | Version | mnemonic | Description |
 827 |----|------|------|---------|----------|-------------|
 828 |VA  | I    | #    | 3.2B    |todo   | |
 829
 830 ----------------
 831
 832 [[!tag opf_rfc]]