openpower/sv/rfc/ls006.mdwn

# RFC ls006 FPR <-> GPR Move/Conversion

**URLs**:

* <https://libre-soc.org/openpower/sv/int_fp_mv/>
* <https://libre-soc.org/openpower/sv/rfc/ls006/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Single-precision Instructions added:

* `fmvtgs` -- Single-Precision Floating Move To GPR
* `fmvfgs` -- Single-Precision Floating Move From GPR
* `fcvtstg` -- Single-Precision Floating Convert To Integer In GPR
* `fcvtfgs` -- Single-Precision Floating Convert From Integer In GPR

Identical (except Double-precision) Instructions added:

* `fmvtg` -- Double-Precision Floating Move To GPR
* `fmvfg` -- Double-Precision Floating Move From GPR
* `fcvttg` -- Double-Precision Floating Convert To Integer In GPR
* `fcvtfg` -- Double-Precision Floating Convert From Integer In GPR

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of four new Single-Precision GPR-FPR-based instructions
* Addition of four new Double-Precision GPR-FPR-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
    GPR, FPR, Move, Conversion, JavaScript
```

**Motivation**

CPUs without VSX/VMX lack an efficient way to transfer data between FPRs
and GPRs: the data has to go through memory. This proposal adds
instructions (both bitwise copy and Integer <-> FP conversion) that
transfer directly between FPRs and GPRs without going through memory.

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. The conversion
instructions in this proposal therefore provide a selectable conversion
mode; the variants are described in the "Floating-point to Integer
Conversion Overview" section.

**Notes and Observations**:

* These instructions are present in many other ISAs.
* JavaScript rounding as one instruction saves 35 instructions including
  six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
* Both sets are orthogonal (no difference except being Single/Double).
  This allows IBM to follow the pre-existing precedent of allocating
  separate Major Opcodes (PO) for Double-precision and Single-precision
  respectively.

**Changes**

Add the following entries to:

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# Immediate Tables

Tables that are used by
`fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:

## `IT` -- Integer Type

| `IT` | Integer Type    | Assembly Alias Mnemonic |
|------|-----------------|-------------------------|
| 0    | Signed 32-bit   | `<op>w`                 |
| 1    | Unsigned 32-bit | `<op>uw`                |
| 2    | Signed 64-bit   | `<op>d`                 |
| 3    | Unsigned 64-bit | `<op>ud`                |

## `CVM` -- Float to Integer Conversion Mode

| `CVM` | `rounding_mode` | Semantics                        |
|-------|-----------------|----------------------------------|
| 000   | from `FPSCR`    | [OpenPower semantics]            |
| 001   | Truncate        | [OpenPower semantics]            |
| 010   | from `FPSCR`    | [Java/Saturating semantics]      |
| 011   | Truncate        | [Java/Saturating semantics]      |
| 100   | from `FPSCR`    | [JavaScript semantics]           |
| 101   | Truncate        | [JavaScript semantics]           |
| rest  | --              | illegal instruction trap for now |

[OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
[Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
[JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
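
As a rough illustration of how a simulator or disassembler might decode the two tables above, here is a minimal Python sketch. The names `IT_TABLE`, `SEMANTICS`, `decode_it` and `decode_cvm` are hypothetical and not part of this RFC; the sketch only restates the table contents.

```python
# Illustrative only: restates the IT and CVM tables as lookup helpers.
IT_TABLE = {
    0: (True, 32, "w"),    # Signed 32-bit
    1: (False, 32, "uw"),  # Unsigned 32-bit
    2: (True, 64, "d"),    # Signed 64-bit
    3: (False, 64, "ud"),  # Unsigned 64-bit
}

SEMANTICS = ("openpower", "java_saturating", "javascript")


def decode_it(it):
    """Return (signed, bits, alias_suffix) for a 2-bit IT value."""
    return IT_TABLE[it]


def decode_cvm(cvm):
    """Return (rounding_source, semantics) for a 3-bit CVM value."""
    if not 0 <= cvm <= 5:
        # CVM = 6 and 7 are reserved: illegal instruction trap for now
        raise ValueError("reserved CVM value")
    rounding = "truncate" if cvm & 1 else "fpscr"
    return rounding, SEMANTICS[cvm >> 1]
```

The low bit of `CVM` selects truncation and the upper two bits select the semantics, which is why the table alternates "from `FPSCR`" and "Truncate".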

----------

## Floating Move To GPR

```
    fmvtg RT, FRB
    fmvtg. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
    RT <- (FRB)
```

Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

```
    CR0     (if Rc=1)
```

----------

## Floating Move To GPR Single

```
    fmvtgs RT, FRB
    fmvtgs. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
    RT <- [0] * 32 || SINGLE((FRB))  # SINGLE since that's what stfs uses
```

Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

```
    CR0     (if Rc=1)
```
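
The two move-to-GPR operations above are pure bit copies. The following Python sketch shows the same idea with the standard `struct` module. The function names simply mirror the mnemonics and are illustrative only. The `fmvtgs` helper assumes the FPR holds a value that is exactly representable in single precision, so that the `SINGLE()` step loses nothing.

```python
import struct


def fmvtg(frb):
    """Return the 64-bit IEEE 754 pattern of a double as an integer."""
    return struct.unpack(">Q", struct.pack(">d", frb))[0]


def fmvtgs(frb):
    """Return the 32-bit single-precision pattern, zero-extended.

    Assumption: frb is exactly representable in single precision.
    """
    return struct.unpack(">I", struct.pack(">f", frb))[0]
```

For example, `fmvtg(1.0)` gives `0x3FF0000000000000` and `fmvtgs(1.0)` gives `0x3F800000`; in both cases no rounding or flag update is involved.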

----------

\newpage{}

## Double-Precision Floating Move From GPR

```
    fmvfg FRT, RB
    fmvfg. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
    FRT <- (RB)
```

Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

```
    CR1     (if Rc=1)
```

----------

## Floating Move From GPR Single

```
    fmvfgs FRT, RB
    fmvfgs. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
    FRT <- DOUBLE((RB)[32:63])  # DOUBLE since that's what lfs uses
```

Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

```
    CR1     (if Rc=1)
```
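
The move-from-GPR direction can be sketched the same way. This is a minimal Python illustration, not a reference model: the function names mirror the mnemonics, and `(RB)[32:63]` in Power ISA bit numbering is taken to be the low 32 bits of the register.

```python
import struct


def fmvfg(rb):
    """Reinterpret a 64-bit GPR value as a double."""
    return struct.unpack(">d", struct.pack(">Q", rb & 0xFFFF_FFFF_FFFF_FFFF))[0]


def fmvfgs(rb):
    """Reinterpret the low 32 bits of a GPR value as a single.

    Python floats are doubles, so the result is already widened,
    loosely corresponding to the DOUBLE() step in the pseudo-code.
    """
    return struct.unpack(">f", struct.pack(">I", rb & 0xFFFF_FFFF))[0]
```

Note that `fmvfgs` ignores the upper 32 bits of the GPR, so `fmvfgs(0xDEADBEEF_3F800000)` and `fmvfgs(0x3F800000)` both yield `1.0`.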

----------

\newpage{}

## Double-Precision Floating Convert From Integer In GPR

```
    fcvtfg FRT, RB, IT
    fcvtfg. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
    if IT[0] = 0 then  # 32-bit int -> 64-bit float
        # rounding never necessary, so don't touch FPSCR
        # based off xvcvsxwdp
        if IT = 0 then  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        else  # IT = 1 -- Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        FRT <- bfp64_CONVERT_FROM_BFP(src)
    else
        # rounding may be necessary. based off xscvuxdsp
        reset_xflags()
        switch(IT)
            case(0):  # Signed 32-bit
                src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
            case(1):  # Unsigned 32-bit
                src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
            case(2):  # Signed 64-bit
                src <- bfp_CONVERT_FROM_SI64((RB))
            default:  # Unsigned 64-bit
                src <- bfp_CONVERT_FROM_UI64((RB))
        rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
        result <- bfp64_CONVERT_FROM_BFP(rnd)
        cls <- fprf_CLASS_BFP64(result)

        if xx_flag = 1 then SetFX(FPSCR.XX)

        FRT <- result
        FPSCR.FPRF <- cls
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

```
    CR1     (if Rc=1)
    FPSCR   (TODO: which bits?) (if IT[0]=1)
```
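
To make the "32-bit is always exact, 64-bit may round" distinction concrete, here is a hedged Python sketch of the value computed by `fcvtfg`. It assumes `FPSCR.RN` selects round-to-nearest-even, which is the rounding Python's own int-to-float conversion performs. The function name and the returned `inexact` flag (standing in for `xx_flag`) are illustrative only.

```python
def fcvtfg(rb, it):
    """Return (value, inexact) for a 64-bit GPR value rb and 2-bit IT.

    Assumption: rounding mode is round-to-nearest-even.
    """
    bits = 32 if it < 2 else 64
    value = rb & ((1 << bits) - 1)      # (RB)[32:63] or all of (RB)
    if it in (0, 2) and value >> (bits - 1):
        value -= 1 << bits              # reinterpret the bits as signed
    result = float(value)               # rounds only if value needs > 53 bits
    inexact = int(result) != value      # corresponds to xx_flag
    return result, inexact
```

With `IT` = 0 or 1 the `inexact` flag can never be set, because every 32-bit integer fits in the 53-bit significand of a double. With `IT` = 3, the input `0xFFFF_FFFF_FFFF_FFFF` rounds up to 2^64 and is reported as inexact.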

### Assembly Aliases

| Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
|----------------------|----------------------|------|----------------------|----------------------|
| `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
| `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
| `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
| `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |

----------

\newpage{}

## Floating Convert From Integer In GPR Single

```
    fcvtfgs FRT, RB, IT
    fcvtfgs. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0):  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1):  # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2):  # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default:  # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
    result32 <- bfp32_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP32(result32)
    result <- DOUBLE(result32)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

```
    CR1     (if Rc=1)
    FPSCR   (TODO: which bits?)
```
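
Unlike the double-precision case, even 32-bit integers may need rounding here, because single precision has only a 24-bit significand. The sketch below rounds directly from the integer to single precision with pure integer arithmetic, which avoids rounding twice (once to double, then to single). It assumes round-to-nearest-even and inputs no larger in magnitude than 2^64. The function name is hypothetical.

```python
def round_int_to_f32(value):
    """Round an integer to the nearest single-precision value.

    Returns (result, inexact). Ties go to the even significand.
    Assumption: abs(value) <= 2**64, so the result cannot overflow.
    """
    magnitude = abs(value)
    excess = magnitude.bit_length() - 24    # bits beyond a 24-bit significand
    rounded = magnitude
    if excess > 0:
        half = 1 << (excess - 1)
        remainder = magnitude & ((1 << excess) - 1)
        rounded = magnitude >> excess
        if remainder > half or (remainder == half and rounded & 1):
            rounded += 1                    # round up (or tie to even)
        rounded <<= excess
    # rounded now has at most 24 significant bits, so float() is exact
    result = float(rounded if value >= 0 else -rounded)
    return result, rounded != magnitude
```

For instance, 2^24 + 1 is a tie between 16777216 and 16777218 and rounds to the even significand, 16777216, while the unsigned 32-bit input `0xFFFF_FFFF` rounds up to 2^32.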

### Assembly Aliases

| Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
|----------------------|-----------------------|------|----------------------|-----------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |

----------

\newpage{}

## Floating-point to Integer Conversion Overview

<div id="fpr-to-gpr-conversion-mode"></div>

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices.  Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.

For convenience, we will give those different conversion semantics names
based on which common ISA or programming language uses them, since there
may not be an established name for them:

**Standard OpenPower conversion**

This conversion performs "saturation with NaN converted to minimum
valid integer". This is also exactly the same as the x86 ISA conversion
semantics.  OpenPOWER however has instructions for both:

* rounding mode read from FPSCR
* rounding mode always set to truncate

**Java/Saturating conversion**

For the sake of simplicity, the FP -> Integer conversion semantics
generalized from those used by Java's semantics (and Rust's `as`
operator) will be referred to as [Java/Saturating conversion
semantics](#fp-to-int-java-saturating-conversion-semantics).

Those same semantics are used in some way by all of the following
languages (not necessarily for the default conversion method):

* Java's
  [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
  (only for long/int results)
* Rust's FP -> Integer conversion using the
  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
* LLVM's
  [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
  [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
* SPIR-V's OpenCL dialect's
  [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
  [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
  instructions when decorated with
  [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)

**JavaScript conversion**

For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).

An instruction with these semantics is present in the ARM instruction
set as `FJCVTZS`:
<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>

**Rc=1 and OE=1**

All of these instructions have an Rc=1 mode which sets CR0
in the normal way for any instructions producing a GPR result.
Additionally, when OE=1, if the numerical value of the FP number
is not 100% accurately preserved (due to truncation or saturation
and including when the FP number was NaN) then this is considered
to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
are all set as normal for any GPR instructions that overflow.
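
The OE=1 overflow condition described above amounts to "the integer result is not numerically equal to the floating-point source". A minimal Python sketch of that test follows. The function name is hypothetical, and the sketch relies on Python comparing ints and floats by exact numerical value.

```python
import math


def conversion_overflowed(src, result):
    """True when the integer result does not exactly equal the FP source."""
    return math.isnan(src) or result != src
```

So converting `3.0` to `3` is not an overflow, while converting `3.5` to `3` (truncation), `1e30` to a saturated maximum, or any NaN is.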

\newpage{}

### FP to Integer Conversion Simplified Pseudo-code

Key for pseudo-code:

| term                      | result type | definition                                                                                    |
|---------------------------|-------------|-----------------------------------------------------------------------------------------------|
| `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                  |
| `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                         |
| `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                           |
| `int::BITS`               | `int`       | the bit-width of `int`                                                                        |
| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                       |
| `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                         |
| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                         |
| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                      |
| `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.      |
| `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode` |

<div id="fp-to-int-openpower-conversion-semantics"></div>
OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
Power ISA v3.1B):

```
    def fp_to_int_open_power<fp, int>(v: fp) -> int:
        if v is NaN:
            return int::MIN_VALUE
        if v >= int::MAX_VALUE:
            return int::MAX_VALUE
        if v <= int::MIN_VALUE:
            return int::MIN_VALUE
        return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-java-saturating-conversion-semantics"></div>
[Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
(only for long/int results)
(with adjustment to add non-truncate rounding modes):

```
    def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
        if v is NaN:
            return 0
        if v >= int::MAX_VALUE:
            return int::MAX_VALUE
        if v <= int::MIN_VALUE:
            return int::MIN_VALUE
        return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-javascript-conversion-semantics"></div>
Section 7.1 of the ECMAScript / JavaScript
[conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
(with adjustment to add non-truncate rounding modes):

```
    def fp_to_int_java_script<fp, int>(v: fp) -> int:
        if v is NaN or infinite:
            return 0
        v = rint(v, rounding_mode)  # assume no loss of precision in result
        v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
        bits = (uint)v
        return (int)bits
```
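
The three simplified definitions above can be tried out directly. The following is a runnable Python rendering, offered as a sketch rather than a reference model: the integer type is passed as a `bits`/`signed` pair instead of a generic parameter, the rounding-mode names are hypothetical strings, and Python's arbitrary-precision integers stand in for the "no loss of precision" assumption.

```python
import math


def rint(v, rounding_mode):
    """Round a finite float to an integer, returned as a Python int."""
    if rounding_mode == "truncate":
        return math.trunc(v)
    if rounding_mode == "nearest_even":
        return round(v)          # Python rounds ties to even
    if rounding_mode == "ceil":
        return math.ceil(v)
    return math.floor(v)         # "floor"


def int_range(bits, signed):
    """Return (MIN_VALUE, MAX_VALUE) of the target integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fp_to_int_open_power(v, bits, signed, rounding_mode="truncate"):
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return lo
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return rint(v, rounding_mode)


def fp_to_int_java_saturating(v, bits, signed, rounding_mode="truncate"):
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return 0
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return rint(v, rounding_mode)


def fp_to_int_java_script(v, bits, signed, rounding_mode="truncate"):
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = rint(v, rounding_mode) % (1 << bits)   # always non-negative
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits                         # reinterpret as signed
    return wrapped
```

The only difference between the first two functions is the NaN result (minimum value versus zero). The JavaScript variant never saturates: `fp_to_int_java_script(4294967301.0, 32, True)` wraps modulo 2^32 to `5`.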

----------

\newpage{}

## Double-Precision Floating Convert To Integer In GPR

```
    fcvttg RT, FRB, CVM, IT
    fcvttg. RT, FRB, CVM, IT
    fcvttgo RT, FRB, CVM, IT
    fcvttgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
    # based on xscvdpuxws
    reset_xflags()
    src <- bfp_CONVERT_FROM_BFP64((FRB))

    switch(IT)
        case(0):  # Signed 32-bit
            range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
            range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(1):  # Unsigned 32-bit
            range_min <- bfp_CONVERT_FROM_UI32(0)
            range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(2):  # Signed 64-bit
            range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
            range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF
        default:  # Unsigned 64-bit
            range_min <- bfp_CONVERT_FROM_UI64(0)
            range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF

    if CVM[2] = 1 or FPSCR.RN = 0b01 then
        rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
    else if FPSCR.RN = 0b00 then
        rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
    else if FPSCR.RN = 0b10 then
        rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
    else if FPSCR.RN = 0b11 then
        rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

    switch(CVM)
        case(0, 1):  # OpenPower semantics
            if IsNaN(rnd) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        case(2, 3):  # Java/Saturating semantics
            if IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        default:  # JavaScript semantics
            # CVM = 6, 7 are illegal instructions
            # this works because the largest type we try to convert from has
            # 53 significand bits, and the largest type we try to convert to
            # has 64 bits, and the sum of those is strictly less than the 128
            # bits of the intermediate result.
            limit <- bfp_CONVERT_FROM_UI128([1] * 128)
            if IsInf(rnd) or IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
                result <- [0] * 64
            else
                result128 <- si128_CONVERT_FROM_BFP(rnd)
                result <- result128[64:127] & js_mask

    switch(IT)
        case(0):  # Signed 32-bit
            result <- EXTS64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
        case(1):  # Unsigned 32-bit
            result <- EXTZ64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
        case(2):  # Signed 64-bit
            result_bfp <- bfp_CONVERT_FROM_SI64(result)
        default:  # Unsigned 64-bit
            result_bfp <- bfp_CONVERT_FROM_UI64(result)

    if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
    if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
    if xx_flag = 1 then SetFX(FPSCR.XX)

    vx_flag <- vxsnan_flag | vxcvi_flag
    vex_flag <- FPSCR.VE & vx_flag

    if vex_flag = 0 then
        RT <- result
        FPSCR.FPRF <- undefined
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
        if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
            overflow <- 1  # signals SO only when OE = 1
    else
        FPSCR.FR <- 0
        FPSCR.FI <- 0
```

Convert from 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result.  Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

```
    CR0              (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR   (TODO: which bits?)
```
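
The XO-Form field layout given for `fcvttg` can be packed into a 32-bit instruction word as sketched below. The RFC does not yet allocate `PO` or `XO` values, so they are plain parameters here and any values passed in are placeholders. The function name is hypothetical. Power ISA numbers bits from the most significant end, so a field whose last bit is `b` is shifted left by `31 - b`.

```python
def encode_fcvttg(po, xo, rt, frb, cvm, it, oe=0, rc=0):
    """Pack XO-Form fields into a 32-bit word (bit 0 is the MSB).

    po and xo are unallocated in the RFC: callers supply placeholders.
    """
    fields = (
        # (value, width in bits, last bit number in Power ISA numbering)
        (po, 6, 5),
        (rt, 5, 10),
        (it, 2, 12),
        (cvm, 3, 15),
        (frb, 5, 20),
        (xo, 9, 29),
        (oe, 1, 30),
        (rc, 1, 31),
    )
    word = 0
    for value, width, last_bit in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value out of range")
        word |= value << (31 - last_bit)
    return word
```

With `po` and `xo` left at zero, `encode_fcvttg(0, 0, rt=1, frb=2, cvm=5, it=3, oe=1, rc=1)` yields `0x003D1003`, which shows where each operand field lands in the word.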

### Assembly Aliases

| Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
|---------------------------|----------------------------|---------------------------|----------------------------|
| `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
| `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
| `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
| `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
| `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
| `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
| `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
| `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |

----------

\newpage{}

## Floating Convert Single To Integer In GPR

```
    fcvtstg RT, FRB, CVM, IT
    fcvtstg. RT, FRB, CVM, IT
    fcvtstgo RT, FRB, CVM, IT
    fcvtstgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
    # based on xscvdpuxws
    reset_xflags()
    src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

    switch(IT)
        case(0):  # Signed 32-bit
            range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
            range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(1):  # Unsigned 32-bit
            range_min <- bfp_CONVERT_FROM_UI32(0)
            range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
            js_mask <- 0xFFFF_FFFF
        case(2):  # Signed 64-bit
            range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
            range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF
        default:  # Unsigned 64-bit
            range_min <- bfp_CONVERT_FROM_UI64(0)
            range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
            js_mask <- 0xFFFF_FFFF_FFFF_FFFF

    if CVM[2] = 1 or FPSCR.RN = 0b01 then
        rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
    else if FPSCR.RN = 0b00 then
        rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
    else if FPSCR.RN = 0b10 then
        rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
    else if FPSCR.RN = 0b11 then
        rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

    switch(CVM)
        case(0, 1):  # OpenPower semantics
            if IsNaN(rnd) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        case(2, 3):  # Java/Saturating semantics
            if IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(rnd, range_max) then
                result <- ui64_CONVERT_FROM_BFP(range_max)
            else if bfp_COMPARE_LT(rnd, range_min) then
                result <- si64_CONVERT_FROM_BFP(range_min)
            else if IT[1] = 1 then  # Unsigned 32/64-bit
                result <- ui64_CONVERT_FROM_BFP(rnd)
            else  # Signed 32/64-bit
                result <- si64_CONVERT_FROM_BFP(rnd)
        default:  # JavaScript semantics
            # CVM = 6, 7 are illegal instructions
            # this works because the largest type we try to convert from has
            # 53 significand bits, and the largest type we try to convert to
            # has 64 bits, and the sum of those is strictly less than the 128
            # bits of the intermediate result.
            limit <- bfp_CONVERT_FROM_UI128([1] * 128)
            if IsInf(rnd) or IsNaN(rnd) then
                result <- [0] * 64
            else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
                result <- [0] * 64
            else
                result128 <- si128_CONVERT_FROM_BFP(rnd)
                result <- result128[64:127] & js_mask

    switch(IT)
        case(0):  # Signed 32-bit
            result <- EXTS64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
        case(1):  # Unsigned 32-bit
            result <- EXTZ64(result[32:63])
            result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
        case(2):  # Signed 64-bit
            result_bfp <- bfp_CONVERT_FROM_SI64(result)
        default:  # Unsigned 64-bit
            result_bfp <- bfp_CONVERT_FROM_UI64(result)

    if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
    if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
    if xx_flag = 1 then SetFX(FPSCR.XX)

    vx_flag <- vxsnan_flag | vxcvi_flag
    vex_flag <- FPSCR.VE & vx_flag

    if vex_flag = 0 then
        RT <- result
        FPSCR.FPRF <- undefined
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
        if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
            overflow <- 1  # signals SO only when OE = 1
    else
        FPSCR.FR <- 0
        FPSCR.FI <- 0
```

Convert from 32-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value, following the usual 32-bit float in 64-bit float
format. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result.  Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

```
    CR0              (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR   (TODO: which bits?)
```

### Assembly Aliases

| Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
|----------------------------|-----------------------------|----------------------------|-----------------------------|
| `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
| `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
| `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
| `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
| `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
| `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
| `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
| `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
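
The alias tables for `fcvttg` and `fcvtstg` follow one regular pattern: base mnemonic, integer-type suffix from the `IT` table, optional `o` for OE=1, optional `.` for Rc=1. An assembler front-end could expand them mechanically, as in this Python sketch. The function and pattern names are hypothetical, and operands are treated as an opaque string.

```python
import re

# base mnemonic, IT suffix, optional "o" (OE=1), optional "." (Rc=1)
_ALIAS = re.compile(r"^(fcvttg|fcvtstg)(uw|ud|w|d)(o?)(\.?)$")
_IT = {"w": 0, "uw": 1, "d": 2, "ud": 3}


def expand_alias(mnemonic, operands):
    """Expand e.g. ("fcvttgwo.", "RT, FRB, CVM") to the full instruction."""
    match = _ALIAS.match(mnemonic)
    if match is None:
        raise ValueError("not a fcvttg/fcvtstg alias")
    base, suffix, oe, rc = match.groups()
    return f"{base}{oe}{rc} {operands}, {_IT[suffix]}"
```

The un-aliased mnemonics (for example `fcvttgo`) deliberately do not match, since they already carry an explicit `IT` operand.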

----------

\newpage{}

----------

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description |
|------|------|------|---------|----------|-------------|
| VA   | I    | #    | 3.2B    | todo     |             |

----------------

[[!tag opf_rfc]]