openpower/sv/rfc/ls006.mdwn

   1 # RFC ls006 FPR <-> GPR Move/Conversion
   2
   3 **URLs**:
   4
   5 * <https://libre-soc.org/openpower/sv/int_fp_mv/>
   6 * <https://libre-soc.org/openpower/sv/rfc/ls006/>
   7 * <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
   8 * <https://git.openpower.foundation/isa/PowerISA/issues/todo>
   9
  10 **Severity**: Major
  11
  12 **Status**: New
  13
  14 **Date**: 20 Oct 2022
  15
  16 **Target**: v3.2B
  17
  18 **Source**: v3.1B
  19
  20 **Books and Section affected**: **UPDATE**
  21
  22 * Book I 4.6.5 Floating-Point Move Instructions
  23 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
  24 * Appendix E Power ISA sorted by opcode
  25 * Appendix F Power ISA sorted by version
  26 * Appendix G Power ISA sorted by Compliancy Subset
  27 * Appendix H Power ISA sorted by mnemonic
  28
  29 **Summary**
  30
  31 Single-precision Instructions added:
  32
  33 * `fmvtgs` -- Single-Precision Floating Move To GPR
  34 * `fmvfgs` -- Single-Precision Floating Move From GPR
  35 * `fcvtstg` -- Single-Precision Floating Convert To Integer In GPR
  36 * `fcvtfgs` -- Single-Precision Floating Convert From Integer In GPR
  37
  38 Identical (except Double-precision) Instructions added:
  39
  40 * `fmvtg` -- Double-Precision Floating Move To GPR
  41 * `fmvfg` -- Double-Precision Floating Move From GPR
  42 * `fcvttg` -- Double-Precision Floating Convert To Integer In GPR
  43 * `fcvtfg` -- Double-Precision Floating Convert From Integer In GPR
  44
  45 **Submitter**: Luke Leighton (Libre-SOC)
  46
  47 **Requester**: Libre-SOC
  48
  49 **Impact on processor**:
  50
  51 * Addition of four new Single-Precision GPR-FPR-based instructions
  52 * Addition of four new Double-Precision GPR-FPR-based instructions
  53
  54 **Impact on software**:
  55
  56 * Requires support for new instructions in assembler, debuggers,
  57   and related tools.
  58
  59 **Keywords**:
  60
  61 ```
  62     GPR, FPR, Move, Conversion, JavaScript
  63 ```
  64
  65 **Motivation**
  66
  67 CPUs without VSX/VMX lack a way to efficiently transfer data between
  68 FPRs and GPRs: values must go through memory. This proposal adds
  69 instructions (both bitwise copy and Integer <-> FP conversion) that
  70 transfer data directly between FPRs and GPRs without needing to go
  71 through memory.
  72
  73 IEEE 754 doesn't specify what results are obtained when converting a NaN
  74 or out-of-range floating-point value to integer, so different programming
  75 languages and ISAs have made different choices.  Below is an overview
  76 of the different variants, listing the languages and hardware that
  77 implement each variant.
  78
  79 **Notes and Observations**:
  80
  81 * These instructions are present in many other ISAs.
  82 * JavaScript rounding as one instruction saves 35 instructions including
  83   six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
  84 * Both sets are orthogonal (no difference except being Single/Double).
  85   This allows IBM to follow the pre-existing precedent of allocating
  86   separate Major Opcodes (PO) for Double-precision and Single-precision
  87   respectively.
  88
  89 **Changes**
  90
  91 Add the following entries to:
  92
  93 * Book I 4.6.5 Floating-Point Move Instructions
  94 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
  95 * Book I 1.6.1 and 1.6.2
  96
  97 ----------------
  98
  99 \newpage{}
 100
 101 # Immediate Tables
 102
 103 Tables that are used by
 104 `fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:
 105
 106 ## `IT` -- Integer Type
 107
 108 | `IT` | Integer Type    | Assembly Alias Mnemonic |
 109 |------|-----------------|-------------------------|
 110 | 0    | Signed 32-bit   | `<op>w`                 |
 111 | 1    | Unsigned 32-bit | `<op>uw`                |
 112 | 2    | Signed 64-bit   | `<op>d`                 |
 113 | 3    | Unsigned 64-bit | `<op>ud`                |
 114
 115 ## `CVM` -- Float to Integer Conversion Mode
 116
 117 | `CVM` | `rounding_mode` | Semantics                        |
 118 |-------|-----------------|----------------------------------|
 119 | 000   | from `FPSCR`    | [OpenPower semantics]            |
 120 | 001   | Truncate        | [OpenPower semantics]            |
 121 | 010   | from `FPSCR`    | [Java/Saturating semantics]      |
 122 | 011   | Truncate        | [Java/Saturating semantics]      |
 123 | 100   | from `FPSCR`    | [JavaScript semantics]           |
 124 | 101   | Truncate        | [JavaScript semantics]           |
 125 | rest  | --              | illegal instruction trap for now |
 126
 127 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
 128 [Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
 129 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
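
As a rough illustration of how the `CVM` table above decomposes, the following Python sketch splits the field into a rounding source and a semantics class. The function name `decode_cvm` and the returned strings are hypothetical and not part of this RFC; only the bit assignments are taken from the table.

```python
from typing import Tuple

def decode_cvm(cvm: int) -> Tuple[str, str]:
    """Split a 3-bit CVM value into (rounding source, conversion semantics)."""
    if not 0 <= cvm <= 0b111:
        raise ValueError("CVM is a 3-bit field")
    if cvm >= 0b110:
        # The table marks the remaining encodings as illegal for now.
        raise ValueError("reserved CVM value: illegal instruction")
    # Least-significant bit (CVM[2] in MSB-0 numbering) selects truncation.
    rounding = "truncate" if cvm & 1 else "fpscr"
    # The upper two bits select the semantics family.
    semantics = ("openpower", "java_saturating", "javascript")[cvm >> 1]
    return rounding, semantics
```

Under these assumptions `decode_cvm(0b011)` yields truncation with Java/Saturating semantics, matching the fourth row of the table.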
 130
 131 ----------
 132
 133 ## Floating Move To GPR
 134
 135 ```
 136     fmvtg RT, FRB
 137     fmvtg. RT, FRB
 138 ```
 139
 140 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 141 |-----|------|-------|-------|-------|----|--------|
 142 | PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |
 143
 144 ```
 145     RT <- (FRB)
 146 ```
 147
 148 Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
 149 representation directly. This is equivalent to `stfd` followed by `ld`.
 150 As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.
 151
 152 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 153 operations.
 154
 155 Special Registers altered:
 156
 157     CR0     (if Rc=1)
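
The bit-copy behaviour of `fmvtg` can be modelled outside the ISA with Python's `struct` module. This is only a sketch: `fmvtg_model` is a hypothetical helper name, and a Python `float` is assumed to be an IEEE 754 binary64 value (true for CPython on common platforms).

```python
import struct

def fmvtg_model(frb: float) -> int:
    """Return the 64-bit IEEE 754 bit pattern of `frb` as an unsigned integer."""
    # Pack as a big-endian double, then reinterpret the same 8 bytes as a u64.
    return struct.unpack(">Q", struct.pack(">d", frb))[0]

print(hex(fmvtg_model(1.0)))  # 0x3ff0000000000000
```

No rounding or flag update is modelled, in line with the statement that `FPSCR` is unaffected.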
 158
 159 ----------
 160
 161 ## Floating Move To GPR Single
 162
 163 ```
 164     fmvtgs RT, FRB
 165     fmvtgs. RT, FRB
 166 ```
 167
 168 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 169 |-----|------|-------|-------|-------|----|--------|
 170 | PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |
 171
 172 ```
 173     RT <- [0] * 32 || SINGLE((FRB))  # SINGLE since that's what stfs uses
 174 ```
 175
 176 Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
 177 representation directly. This is equivalent to `stfs` followed by `lwz`.
 178 As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.
 179
 180 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 181 operations.
 182
 183 Special Registers altered:
 184
 185     CR0     (if Rc=1)
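
A comparable sketch for `fmvtgs`, again using Python's `struct` module. The helper name `fmvtgs_model` is hypothetical, and the model assumes the FPR holds a value that is exactly representable in single precision; for other inputs `struct` rounds or raises, which is not what `SINGLE()` is specified to do.

```python
import struct

def fmvtgs_model(frb: float) -> int:
    """Return the binary32 bit pattern of `frb`, zero-extended to 64 bits.

    Assumption: `frb` is exactly representable as a single-precision value.
    """
    word = struct.unpack(">I", struct.pack(">f", frb))[0]
    return word  # a 32-bit value, so the upper 32 bits of RT are zero

print(hex(fmvtgs_model(1.0)))  # 0x3f800000
```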
 186
 187 ----------
 188
 189 \newpage{}
 190
 191 ## Double-Precision Floating Move From GPR
 192
 193 ```
 194     fmvfg FRT, RB
 195     fmvfg. FRT, RB
 196 ```
 197
 198 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 199 |-----|------|-------|-------|-------|----|--------|
 200 | PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |
 201
 202 ```
 203     FRT <- (RB)
 204 ```
 205
 206 Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
 207 representation directly. This is equivalent to `std` followed by `lfd`.
 208 As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.
 209
 210 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 211 operations.
 212
 213 Special Registers altered:
 214
 215     CR1     (if Rc=1)
 216
 217 ----------
 218
 219 ## Floating Move From GPR Single
 220
 221 ```
 222     fmvfgs FRT, RB
 223     fmvfgs. FRT, RB
 224 ```
 225
 226 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
 227 |-----|------|-------|-------|-------|----|--------|
 228 | PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |
 229
 230 ```
 231     FRT <- DOUBLE((RB)[32:63])  # DOUBLE since that's what lfs uses
 232 ```
 233
 234 Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
 235 representation directly. This is equivalent to `stw` followed by `lfs`.
 236 As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.
 237
 238 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 239 operations.
 240
 241 Special Registers altered:
 242
 243     CR1     (if Rc=1)
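
The `DOUBLE((RB)[32:63])` step can be pictured with the following Python sketch. Both helper names are hypothetical; the model takes the low 32 bits of the GPR, interprets them as binary32, and shows the double-format image an FPR would then hold. It is intended for ordinary finite values only, since NaN payload handling is not modelled.

```python
import struct

def fmvfgs_model(rb: int) -> float:
    """Interpret RB[32:63] (the low 32 bits) as binary32, widened to a double."""
    word = rb & 0xFFFF_FFFF  # bits 0..31 of RB are ignored
    return struct.unpack(">f", struct.pack(">I", word))[0]

def fpr_image(value: float) -> int:
    """64-bit pattern of `value` in double format, as stored in an FPR."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

print(hex(fpr_image(fmvfgs_model(0x3F800000))))  # 0x3ff0000000000000
```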
 244
 245 ----------
 246
 247 \newpage{}
 248
 249 ## Double-Precision Floating Convert From Integer In GPR
 250
 251 ```
 252     fcvtfg FRT, RB, IT
 253     fcvtfg. FRT, RB, IT
 254 ```
 255
 256 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
 257 |-----|------|-------|-------|-------|-------|----|--------|
 258 | PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |
 259
 260 ```
 261     if IT[0] = 0 then  # 32-bit int -> 64-bit float
 262         # rounding never necessary, so don't touch FPSCR
 263         # based off xvcvsxwdp
 264         if IT = 0 then  # Signed 32-bit
 265             src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 266         else  # IT = 1 -- Unsigned 32-bit
 267             src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 268         FRT <- bfp64_CONVERT_FROM_BFP(src)
 269     else
 270         # rounding may be necessary. based off xscvuxdsp
 271         reset_xflags()
 272         switch(IT)
 273             case(0):  # Signed 32-bit
 274                 src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 275             case(1):  # Unsigned 32-bit
 276                 src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 277             case(2):  # Signed 64-bit
 278                 src <- bfp_CONVERT_FROM_SI64((RB))
 279             default:  # Unsigned 64-bit
 280                 src <- bfp_CONVERT_FROM_UI64((RB))
 281         rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
 282         result <- bfp64_CONVERT_FROM_BFP(rnd)
 283         cls <- fprf_CLASS_BFP64(result)
 284
 285         if xx_flag = 1 then SetFX(FPSCR.XX)
 286
 287         FRT <- result
 288         FPSCR.FPRF <- cls
 289         FPSCR.FR <- inc_flag
 290         FPSCR.FI <- xx_flag
 291 ```
 292 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
 293 don't remove them -->
 294
 295 Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
 296 float in FRT.
 297
 298 If converting from an unsigned/signed 32-bit integer to a 64-bit float,
 299 rounding is never necessary, so `FPSCR` is unmodified and exceptions are
 300 never raised. Otherwise, `FPSCR` is modified and exceptions are raised
 301 as usual.
 302
 303 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 304 operations.
 305
 306 Special Registers altered:
 307
 308     CR1     (if Rc=1)
 309     FPSCR   (TODO: which bits?) (if IT[0]=1)
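
To make the "rounding is never necessary for 32-bit sources" point concrete, here is a small Python model. It is a sketch under stated assumptions: `fcvtfg_model` is a hypothetical name, only round-to-nearest-even is modelled (Python's `float(int)` is correctly rounded that way), and the second return value merely indicates whether the conversion was inexact.

```python
from typing import Tuple

def fcvtfg_model(rb: int, it: int) -> Tuple[float, bool]:
    """Return (result, inexact) for a 64-bit GPR value `rb` and 2-bit `it`."""
    bits = 32 if it < 2 else 64          # IT = 0, 1 use RB[32:63] only
    value = rb & ((1 << bits) - 1)
    signed = (it & 1) == 0               # IT = 0, 2 are the signed types
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits               # reinterpret as two's complement
    result = float(value)                # rounds to nearest, ties to even
    return result, int(result) != value

print(fcvtfg_model(0xFFFFFFFF, 0))   # (-1.0, False)
print(fcvtfg_model(2**53 + 1, 2))    # (9007199254740992.0, True)
```

Every 32-bit integer fits in the 53-bit significand of a double, so the inexact indication can only be true for the 64-bit types.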
 310
 311 ### Assembly Aliases
 312
 313 | Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
 314 |----------------------|----------------------|------|----------------------|----------------------|
 315 | `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
 316 | `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
 317 | `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
 318 | `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |
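
The field layout in the encoding table for `fcvtfg` can be exercised with a short Python sketch. `encode_fcvtfg` is a hypothetical helper, and since this RFC does not assign `PO` or `XO` values, the ones used below are placeholders only. Bit positions follow the Power ISA convention in which bit 0 is the most significant bit of the 32-bit word.

```python
def encode_fcvtfg(po: int, frt: int, it: int, rb: int, xo: int, rc: int) -> int:
    """Pack the X-Form fields of fcvtfg into a 32-bit instruction word."""
    def field(value: int, start: int, end: int) -> int:
        width = end - start + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit bits {start}-{end}")
        return value << (31 - end)       # MSB-0 numbering: bit 31 is the LSB

    return (field(po, 0, 5) | field(frt, 6, 10) | field(it, 11, 12)
            | field(rb, 16, 20) | field(xo, 21, 30) | field(rc, 31, 31))

# Placeholder opcode values: PO and XO are not assigned by this RFC.
word = encode_fcvtfg(po=0b111111, frt=3, it=2, rb=5, xo=0, rc=1)
print(f"{word:#010x}")  # 0xfc702801
```

Bits 13-15 are left as zero, as the table requires.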
 319
 320 ----------
 321
 322 \newpage{}
 323
 324 ## Floating Convert From Integer In GPR Single
 325
 326 ```
 327     fcvtfgs FRT, RB, IT
 328     fcvtfgs. FRT, RB, IT
 329 ```
 330
 331 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
 332 |-----|------|-------|-------|-------|-------|----|--------|
 333 | PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |
 334
 335 ```
 336     # rounding may be necessary. based off xscvuxdsp
 337     reset_xflags()
 338     switch(IT)
 339         case(0):  # Signed 32-bit
 340             src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
 341         case(1):  # Unsigned 32-bit
 342             src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
 343         case(2):  # Signed 64-bit
 344             src <- bfp_CONVERT_FROM_SI64((RB))
 345         default:  # Unsigned 64-bit
 346             src <- bfp_CONVERT_FROM_UI64((RB))
 347     rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
 348     result32 <- bfp32_CONVERT_FROM_BFP(rnd)
 349     cls <- fprf_CLASS_BFP32(result32)
 350     result <- DOUBLE(result32)
 351
 352     if xx_flag = 1 then SetFX(FPSCR.XX)
 353
 354     FRT <- result
 355     FPSCR.FPRF <- cls
 356     FPSCR.FR <- inc_flag
 357     FPSCR.FI <- xx_flag
 358 ```
 359 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
 360 don't remove them -->
 361
 362 Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
 363 float in FRT, following the usual 32-bit float in 64-bit float format.
 364 `FPSCR` is modified and exceptions are raised as usual.
 365
 366 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
 367 operations.
 368
 369 Special Registers altered:
 370
 371     CR1     (if Rc=1)
 372     FPSCR   (TODO: which bits?)
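
Unlike the double-precision case, conversion to single precision can round even for 32-bit sources, because a binary32 significand has only 24 bits. The Python sketch below models this with integer arithmetic so that the value is rounded once, directly to 24 significant bits; going through a double first could round twice. The names are hypothetical and only round-to-nearest-even is modelled.

```python
from typing import Tuple

def round_to_significand(n: int, bits: int = 24) -> Tuple[int, bool]:
    """Round integer `n` to at most `bits` significant bits, ties to even."""
    sign = -1 if n < 0 else 1
    mag = abs(n)
    excess = mag.bit_length() - bits
    if excess <= 0:
        return n, False                   # already fits, nothing to round
    low = mag & ((1 << excess) - 1)       # bits that will be discarded
    kept = mag >> excess
    half = 1 << (excess - 1)
    if low > half or (low == half and kept & 1):
        kept += 1                         # round up (or tie towards even)
    return sign * (kept << excess), low != 0

def fcvtfgs_model(value: int) -> Tuple[float, bool]:
    """Return (single-precision result held as a double, inexact)."""
    rounded, inexact = round_to_significand(value, 24)
    return float(rounded), inexact        # exact: at most 24 significant bits

print(fcvtfgs_model(16777217))  # (16777216.0, True)
```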
 373
 374 ### Assembly Aliases
 375
 376 | Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
 377 |----------------------|-----------------------|------|----------------------|-----------------------|
 378 | `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
 379 | `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
 380 | `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
 381 | `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |
 382
 383 ----------
 384
 385 \newpage{}
 386
 387 ## Floating-point to Integer Conversion Overview
 388
 389 <div id="fpr-to-gpr-conversion-mode"></div>
 390
 391 IEEE 754 doesn't specify what results are obtained when converting a NaN
 392 or out-of-range floating-point value to integer, so different programming
 393 languages and ISAs have made different choices.  Below is an overview
 394 of the different variants, listing the languages and hardware that
 395 implement each variant.
 396
 397 For convenience, we will give those different conversion semantics names
 398 based on which common ISA or programming language uses them, since there
 399 may not be an established name for them:
 400
 401 **Standard OpenPower conversion**
 402
 403 This conversion performs "saturation with NaN converted to minimum
 404 valid integer". This is also exactly the same as the x86 ISA conversion
 405 semantics.  OpenPower, however, has instructions for both:
 406
 407 * rounding mode read from FPSCR
 408 * rounding mode always set to truncate
 409
 410 **Java/Saturating conversion**
 411
 412 For the sake of simplicity, the FP -> Integer conversion semantics
 413 generalized from those used by Java's semantics (and Rust's `as`
 414 operator) will be referred to as [Java/Saturating conversion
 415 semantics](#fp-to-int-java-saturating-conversion-semantics).
 416
 417 Those same semantics are used in some way by all of the following
 418 languages (not necessarily for the default conversion method):
 419
 420 * Java's
 421   [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 422   (only for long/int results)
 423 * Rust's FP -> Integer conversion using the
 424   [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 425 * LLVM's
 426   [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
 427   [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
 428 * SPIR-V's OpenCL dialect's
 429   [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
 430   [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
 431   instructions when decorated with
 432   [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
 433 * WebAssembly has also introduced
 434   [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
 435   [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
 436
 437 **JavaScript conversion**
 438
 439 For the sake of simplicity, the FP -> Integer conversion
 440 semantics generalized from those used by JavaScript's `ToInt32`
 441 abstract operation will be referred to as [JavaScript conversion
 442 semantics](#fp-to-int-javascript-conversion-semantics).
 443
 444 This instruction is present in ARM assembler as FJCVTZS
 445 <https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
 446
 447 **Rc=1 and OE=1**
 448
 449 All of these instructions have an Rc=1 mode which sets CR0
 450 in the normal way for any instructions producing a GPR result.
 451 Additionally, when OE=1, if the numerical value of the FP number
 452 is not 100% accurately preserved (due to truncation or saturation
 453 and including when the FP number was NaN) then this is considered
 454 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
 455 are all set as normal for any GPR instructions that overflow.
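
The OE=1 overflow condition described above amounts to asking whether the integer result still equals the original FP value. A minimal Python sketch of just that test follows; `conversion_overflows` is a hypothetical name, and the actual update of CR0.SO, XER.SO and XER.OV is not modelled.

```python
import math

def conversion_overflows(src: float, result: int) -> bool:
    """True when the integer result does not exactly preserve `src`."""
    # Python compares float and int by exact numerical value, so truncation,
    # saturation and infinities all show up as a mismatch here.
    return math.isnan(src) or src != result

print(conversion_overflows(1.5, 1))   # True
print(conversion_overflows(-0.0, 0))  # False
```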
 456
 457 \newpage{}
 458
 459 ### FP to Integer Conversion Simplified Pseudo-code
 460
 461 Key for pseudo-code:
 462
 463 | term                      | result type | definition                                                                                         |
 464 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
 465 | `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                       |
 466 | `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                              |
 467 | `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                                |
 468 | `int::BITS`               | `int`       | the bit-width of `int`                                                                             |
 469 | `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                            |
 470 | `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                              |
 471 | `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                              |
 472 | `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                           |
 473 | `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.           |
 474 | `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode`      |
 475
 476 <div id="fp-to-int-openpower-conversion-semantics"></div>
 477 OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
 478 Power ISA v3.1B):
 479
 480 ```
 481     def fp_to_int_open_power<fp, int>(v: fp) -> int:
 482         if v is NaN:
 483             return int::MIN_VALUE
 484         if v >= int::MAX_VALUE:
 485             return int::MAX_VALUE
 486         if v <= int::MIN_VALUE:
 487             return int::MIN_VALUE
 488         return (int)rint(v, rounding_mode)
 489 ```
 490
 491 <div id="fp-to-int-java-saturating-conversion-semantics"></div>
 492 [Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 493 (only for long/int results)/
 494 [Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 495 (with adjustment to add non-truncate rounding modes):
 496
 497 ```
 498     def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
 499         if v is NaN:
 500             return 0
 501         if v >= int::MAX_VALUE:
 502             return int::MAX_VALUE
 503         if v <= int::MIN_VALUE:
 504             return int::MIN_VALUE
 505         return (int)rint(v, rounding_mode)
 506 ```
 507
 508 <div id="fp-to-int-javascript-conversion-semantics"></div>
 509 Section 7.1 of the ECMAScript / JavaScript
 510 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
 511 (with adjustment to add non-truncate rounding modes):
 512
 513 ```
 514     def fp_to_int_java_script<fp, int>(v: fp) -> int:
 515         if v is NaN or infinite:
 516             return 0
 517         v = rint(v, rounding_mode)  # assume no loss of precision in result
 518         v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
 519         bits = (uint)v
 520         return (int)bits
 521 ```
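
The three simplified pseudo-code functions above can be compared side by side with a runnable Python sketch. Assumptions are flagged here and in the comments: the function names and the `bits`/`signed` parameters are hypothetical, and only the truncating rounding mode is modelled (via `math.trunc`), not the `FPSCR`-selected modes.

```python
import math

def _limits(bits: int, signed: bool):
    """(MIN_VALUE, MAX_VALUE) of the target integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1

def _saturate(v: float, bits: int, signed: bool, nan_result: int) -> int:
    lo, hi = _limits(bits, signed)
    if math.isnan(v):
        return nan_result
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return math.trunc(v)          # truncate only; an assumption of this sketch

def to_int_openpower(v: float, bits: int = 32, signed: bool = True) -> int:
    return _saturate(v, bits, signed, _limits(bits, signed)[0])  # NaN -> MIN

def to_int_java(v: float, bits: int = 32, signed: bool = True) -> int:
    return _saturate(v, bits, signed, 0)                         # NaN -> 0

def to_int_javascript(v: float, bits: int = 32, signed: bool = True) -> int:
    if math.isnan(v) or math.isinf(v):
        return 0
    n = math.trunc(v) % (1 << bits)   # non-negative, like `mod` above
    if signed and n >= 1 << (bits - 1):
        n -= 1 << bits                # reinterpret the bits as signed
    return n

for x in (float("nan"), 1e10, 2.0**32 + 5):
    print(x, to_int_openpower(x), to_int_java(x), to_int_javascript(x))
```

The loop shows where the variants diverge: NaN gives the minimum value, 0 and 0 respectively, while `2^32 + 5` saturates in the first two and wraps to 5 in the JavaScript variant.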
 522
 523
 524 ----------
 525
 526 \newpage{}
 527
 528
 529 ## Double-Precision Floating Convert To Integer In GPR
 530
 531 ```
 532     fcvttg RT, FRB, CVM, IT
 533     fcvttg. RT, FRB, CVM, IT
 534     fcvttgo RT, FRB, CVM, IT
 535     fcvttgo. RT, FRB, CVM, IT
 536 ```
 537
 538 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
 539 |-----|------|-------|-------|-------|-------|----|----|---------|
 540 | PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |
 541
 542 ```
 543     # based on xscvdpuxws
 544     reset_xflags()
 545     src <- bfp_CONVERT_FROM_BFP64((FRB))
 546
 547     switch(IT)
 548         case(0):  # Signed 32-bit
 549             range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
 550             range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
 551             js_mask <- 0xFFFF_FFFF
 552         case(1):  # Unsigned 32-bit
 553             range_min <- bfp_CONVERT_FROM_UI32(0)
 554             range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
 555             js_mask <- 0xFFFF_FFFF
 556         case(2):  # Signed 64-bit
 557             range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
 558             range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
 559             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 560         default:  # Unsigned 64-bit
 561             range_min <- bfp_CONVERT_FROM_UI64(0)
 562             range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
 563             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 564
 565     if CVM[2] = 1 or FPSCR.RN = 0b01 then
 566         rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
 567     else if FPSCR.RN = 0b00 then
 568         rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
 569     else if FPSCR.RN = 0b10 then
 570         rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
 571     else if FPSCR.RN = 0b11 then
 572         rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)
 573
 574     switch(CVM)
 575         case(0, 1):  # OpenPower semantics
 576             if IsNaN(rnd) then
 577                 result <- si64_CONVERT_FROM_BFP(range_min)
 578             else if bfp_COMPARE_GT(rnd, range_max) then
 579                 result <- ui64_CONVERT_FROM_BFP(range_max)
 580             else if bfp_COMPARE_LT(rnd, range_min) then
 581                 result <- si64_CONVERT_FROM_BFP(range_min)
 582             else if IT[1] = 1 then  # Unsigned 32/64-bit
 583                 result <- ui64_CONVERT_FROM_BFP(rnd)
 584             else  # Signed 32/64-bit
 585                 result <- si64_CONVERT_FROM_BFP(rnd)
 586         case(2, 3):  # Java/Saturating semantics
 587             if IsNaN(rnd) then
 588                 result <- [0] * 64
 589             else if bfp_COMPARE_GT(rnd, range_max) then
 590                 result <- ui64_CONVERT_FROM_BFP(range_max)
 591             else if bfp_COMPARE_LT(rnd, range_min) then
 592                 result <- si64_CONVERT_FROM_BFP(range_min)
 593             else if IT[1] = 1 then  # Unsigned 32/64-bit
 594                 result <- ui64_CONVERT_FROM_BFP(rnd)
 595             else  # Signed 32/64-bit
 596                 result <- si64_CONVERT_FROM_BFP(rnd)
 597         default:  # JavaScript semantics
 598             # CVM = 6, 7 are illegal instructions
 599             # this works because the largest type we try to convert from has
 600             # 53 significand bits, and the largest type we try to convert to
 601             # has 64 bits, and the sum of those is strictly less than the 128
 602             # bits of the intermediate result.
 603             limit <- bfp_CONVERT_FROM_UI128([1] * 128)
 604             if IsInf(rnd) or IsNaN(rnd) then
 605                 result <- [0] * 64
 606             else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
 607                 result <- [0] * 64
 608             else
 609                 result128 <- si128_CONVERT_FROM_BFP(rnd)
 610                 result <- result128[64:127] & js_mask
 611
 612     switch(IT)
 613         case(0):  # Signed 32-bit
 614             result <- EXTS64(result[32:63])
 615             result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
 616         case(1):  # Unsigned 32-bit
 617             result <- EXTZ64(result[32:63])
 618             result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
 619         case(2):  # Signed 64-bit
 620             result_bfp <- bfp_CONVERT_FROM_SI64(result)
 621         default:  # Unsigned 64-bit
 622             result_bfp <- bfp_CONVERT_FROM_UI64(result)
 623
 624     if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
 625     if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
 626     if xx_flag = 1 then SetFX(FPSCR.XX)
 627
 628     vx_flag <- vxsnan_flag | vxcvi_flag
 629     vex_flag <- FPSCR.VE & vx_flag
 630
 631     if vex_flag = 0 then
 632         RT <- result
 633         FPSCR.FPRF <- undefined
 634         FPSCR.FR <- inc_flag
 635         FPSCR.FI <- xx_flag
 636         if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
 637             overflow <- 1  # signals SO only when OE = 1
 638     else
 639         FPSCR.FR <- 0
 640         FPSCR.FI <- 0
 641 ```
 642
 643 Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
 644 in RT, with the conversion overflow/rounding semantics following the
 645 chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.
 646
 647 These instructions have an Rc=1 mode which sets CR0 in the normal
 648 way for any instructions producing a GPR result.  Additionally, when OE=1,
 649 if the numerical value of the FP number is not 100% accurately preserved
 650 (due to truncation or saturation and including when the FP number was
 651 NaN) then this is considered to be an Integer Overflow condition, and
 652 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
 653 that overflow.
 654
 655 Special Registers altered:
 656
 657     CR0              (if Rc=1)
 658     XER SO, OV, OV32 (if OE=1)
 659     FPSCR   (TODO: which bits?)
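
The rounding-mode selection at the start of the pseudo-code (truncate when the low `CVM` bit is set, otherwise follow `FPSCR.RN`) can be sketched in Python as follows. `round_to_integer` is a hypothetical name, the input is assumed to be finite, and Python's built-in `round` is used because it rounds ties to even.

```python
import math

def round_to_integer(v: float, cvm: int, fpscr_rn: int) -> int:
    """Round a finite `v` to an integer as selected by CVM and FPSCR.RN."""
    if (cvm & 1) or fpscr_rn == 0b01:
        return math.trunc(v)      # round toward zero
    if fpscr_rn == 0b00:
        return round(v)           # round to nearest, ties to even
    if fpscr_rn == 0b10:
        return math.ceil(v)       # round toward +infinity
    return math.floor(v)          # 0b11: round toward -infinity

print(round_to_integer(2.5, cvm=0b000, fpscr_rn=0b00))   # 2
print(round_to_integer(-2.5, cvm=0b000, fpscr_rn=0b11))  # -3
```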
 660
 661 ### Assembly Aliases
 662
 663 | Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
 664 |---------------------------|----------------------------|---------------------------|----------------------------|
 665 | `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
 666 | `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
 667 | `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
 668 | `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
 669 | `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
 670 | `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
 671 | `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
 672 | `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
 673
 674 ----------
 675
 676 \newpage{}
 677
 678 ## Floating Convert Single To Integer In GPR
 679
 680 ```
 681     fcvtstg RT, FRB, CVM, IT
 682     fcvtstg. RT, FRB, CVM, IT
 683     fcvtstgo RT, FRB, CVM, IT
 684     fcvtstgo. RT, FRB, CVM, IT
 685 ```
 686
 687 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
 688 |-----|------|-------|-------|-------|-------|----|----|---------|
 689 | PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |
 690
 691 ```
 692     # based on xscvdpuxws
 693     reset_xflags()
 694     src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))
 695
 696     switch(IT)
 697         case(0):  # Signed 32-bit
 698             range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
 699             range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
 700             js_mask <- 0xFFFF_FFFF
 701         case(1):  # Unsigned 32-bit
 702             range_min <- bfp_CONVERT_FROM_UI32(0)
 703             range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
 704             js_mask <- 0xFFFF_FFFF
 705         case(2):  # Signed 64-bit
 706             range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
 707             range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
 708             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 709         default:  # Unsigned 64-bit
 710             range_min <- bfp_CONVERT_FROM_UI64(0)
 711             range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
 712             js_mask <- 0xFFFF_FFFF_FFFF_FFFF
 713
 714     if CVM[2] = 1 or FPSCR.RN = 0b01 then
 715         rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
 716     else if FPSCR.RN = 0b00 then
 717         rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
 718     else if FPSCR.RN = 0b10 then
 719         rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
 720     else if FPSCR.RN = 0b11 then
 721         rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)
 722
 723     switch(CVM)
 724         case(0, 1):  # OpenPower semantics
 725             if IsNaN(rnd) then
 726                 result <- si64_CONVERT_FROM_BFP(range_min)
 727             else if bfp_COMPARE_GT(rnd, range_max) then
 728                 result <- ui64_CONVERT_FROM_BFP(range_max)
 729             else if bfp_COMPARE_LT(rnd, range_min) then
 730                 result <- si64_CONVERT_FROM_BFP(range_min)
 731             else if IT[1] = 1 then  # Unsigned 32/64-bit
 732                 result <- ui64_CONVERT_FROM_BFP(rnd)
 733             else  # Signed 32/64-bit
 734                 result <- si64_CONVERT_FROM_BFP(rnd)
 735         case(2, 3):  # Java/Saturating semantics
 736             if IsNaN(rnd) then
 737                 result <- [0] * 64
 738             else if bfp_COMPARE_GT(rnd, range_max) then
 739                 result <- ui64_CONVERT_FROM_BFP(range_max)
 740             else if bfp_COMPARE_LT(rnd, range_min) then
 741                 result <- si64_CONVERT_FROM_BFP(range_min)
 742             else if IT[1] = 1 then  # Unsigned 32/64-bit
 743                 result <- ui64_CONVERT_FROM_BFP(rnd)
 744             else  # Signed 32/64-bit
 745                 result <- si64_CONVERT_FROM_BFP(rnd)
 746         default:  # JavaScript semantics
 747             # CVM = 6, 7 are illegal instructions
 748             # this works because the largest type we try to convert from has
 749             # 53 significand bits, and the largest type we try to convert to
 750             # has 64 bits, and the sum of those is strictly less than the 128
 751             # bits of the intermediate result.
 752             limit <- bfp_CONVERT_FROM_UI128([1] * 128)
 753             if IsInf(rnd) or IsNaN(rnd) then
 754                 result <- [0] * 64
 755             else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
 756                 result <- [0] * 64
 757             else
 758                 result128 <- si128_CONVERT_FROM_BFP(rnd)
 759                 result <- result128[64:127] & js_mask
 760
 761     switch(IT)
 762         case(0):  # Signed 32-bit
 763             result <- EXTS64(result[32:63])
 764             result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
 765         case(1):  # Unsigned 32-bit
 766             result <- EXTZ64(result[32:63])
 767             result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
 768         case(2):  # Signed 64-bit
 769             result_bfp <- bfp_CONVERT_FROM_SI64(result)
 770         default:  # Unsigned 64-bit
 771             result_bfp <- bfp_CONVERT_FROM_UI64(result)
 772
 773     if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
 774     if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
 775     if xx_flag = 1 then SetFX(FPSCR.XX)
 776
 777     vx_flag <- vxsnan_flag | vxcvi_flag
 778     vex_flag <- FPSCR.VE & vx_flag
 779
 780     if vex_flag = 0 then
 781         RT <- result
 782         FPSCR.FPRF <- undefined
 783         FPSCR.FR <- inc_flag
 784         FPSCR.FI <- xx_flag
 785         if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
 786             overflow <- 1  # signals SO only when OE = 1
 787     else
 788         FPSCR.FR <- 0
 789         FPSCR.FI <- 0
 790 ```
 791
 792 Convert from a 32-bit float in FRB to an unsigned/signed 32/64-bit integer
 793 in RT, with the conversion overflow/rounding semantics following the
 794 chosen `CVM` value, following the usual 32-bit float in 64-bit float
 795 format. `FPSCR` is modified and exceptions are raised as usual.
 796
 797 These instructions have an Rc=1 mode which sets CR0 in the normal
 798 way for any instructions producing a GPR result.  Additionally, when OE=1,
 799 if the numerical value of the FP number is not 100% accurately preserved
 800 (due to truncation or saturation and including when the FP number was
 801 NaN) then this is considered to be an Integer Overflow condition, and
 802 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
 803 that overflow.
 804
 805 Special Registers altered:
 806
 807     CR0              (if Rc=1)
 808     XER SO, OV, OV32 (if OE=1)
 809     FPSCR   (TODO: which bits?)
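
In the JavaScript branch of the pseudo-code, values whose magnitude exceeds the 128-bit limit produce an all-zero result. One way to see why this agrees with the modular semantics: a double of that size has a unit in the last place of at least `2^76`, so it is a multiple of `2^64` and its low 64 bits are zero. The Python check below illustrates this reasoning; `low_bits` is a hypothetical helper, and the check is an illustration rather than a proof.

```python
import math

def low_bits(v: float, bits: int = 64) -> int:
    """Low `bits` bits of the integer value of a finite, integral double."""
    return int(v) % (1 << bits)

# (2^53 - 1) * 2^76: a full 53-bit significand, magnitude above 2^128.
big = math.ldexp(float(2**53 - 1), 76)
# (2^53 - 1) * 2^11: the same significand, magnitude below 2^64.
small = math.ldexp(float(2**53 - 1), 11)

print(low_bits(big))         # 0
print(low_bits(small) != 0)  # True
```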
 810
 811 ### Assembly Aliases
 812
 813 | Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
 814 |----------------------------|-----------------------------|----------------------------|-----------------------------|
 815 | `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
 816 | `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
 817 | `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
 818 | `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
 819 | `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
 820 | `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
 821 | `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
 822 | `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
 823
 824 ----------
 825
 826 \newpage{}
 827
 828 ----------
 829
 830 # Appendices
 831
 832     Appendix E Power ISA sorted by opcode
 833     Appendix F Power ISA sorted by version
 834     Appendix G Power ISA sorted by Compliancy Subset
 835     Appendix H Power ISA sorted by mnemonic
 836
 837 |Form| Book | Page | Version | mnemonic | Description |
 838 |----|------|------|---------|----------|-------------|
 839 |VA  | I    | #    | 3.2B    |todo   | |
 840
 841 ----------------
 842
 843 [[!tag opf_rfc]]