1 # RFC ls006 FPR <-> GPR Move/Conversion
2
3 **URLs**:
4
5 * <https://libre-soc.org/openpower/sv/int_fp_mv/>
6 * <https://libre-soc.org/openpower/sv/rfc/ls006/>
7 * <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
8 * <https://git.openpower.foundation/isa/PowerISA/issues/todo>
9
10 **Severity**: Major
11
12 **Status**: New
13
14 **Date**: 20 Oct 2022
15
16 **Target**: v3.2B
17
18 **Source**: v3.1B
19
20 **Books and Section affected**: **UPDATE**
21
22 * Book I 4.6.5 Floating-Point Move Instructions
23 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
24 * Appendix E Power ISA sorted by opcode
25 * Appendix F Power ISA sorted by version
26 * Appendix G Power ISA sorted by Compliancy Subset
27 * Appendix H Power ISA sorted by mnemonic
28
29 **Summary**
30
31 Single-precision Instructions added:
32
33 * `fmvtgs` -- Single-Precision Floating Move To GPR
34 * `fmvfgs` -- Single-Precision Floating Move From GPR
35 * `fcvttgs` -- Single-Precision Floating Convert To Integer In GPR
36 * `fcvtfgs` -- Single-Precision Floating Convert From Integer In GPR
37
38 Identical (except Double-precision) Instructions added:
39
40 * `fmvtg` -- Double-Precision Floating Move To GPR
41 * `fmvfg` -- Double-Precision Floating Move From GPR
42 * `fcvttg` -- Double-Precision Floating Convert To Integer In GPR
43 * `fcvtfg` -- Double-Precision Floating Convert From Integer In GPR
44
45 **Submitter**: Luke Leighton (Libre-SOC)
46
47 **Requester**: Libre-SOC
48
49 **Impact on processor**:
50
51 * Addition of four new Single-Precision GPR-FPR-based instructions
52 * Addition of four new Double-Precision GPR-FPR-based instructions
53
54 **Impact on software**:
55
56 * Requires support for new instructions in assembler, debuggers,
57 and related tools.
58
59 **Keywords**:
60
61 ```
62 GPR, FPR, Move, Conversion, JavaScript
63 ```
64
65 **Motivation**
66
CPUs without VSX/VMX lack a way to efficiently transfer data between
FPRs and GPRs: every transfer has to go through memory. This proposal adds
instructions (both bitwise copy and Integer <-> FP conversion) that
transfer data directly between FPRs and GPRs, without a round trip
through memory.
72
IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. The conversion instructions
in this proposal therefore offer a choice of semantics, described under
"Floating-point to Integer Conversion Overview".
78
79 **Notes and Observations**:
80
81 * These instructions are present in many other ISAs.
82 * JavaScript rounding as one instruction saves 35 instructions including
83 six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
84 * Both sets are orthogonal (no difference except being Single/Double).
85 This allows IBM to follow the pre-existing precedent of allocating
86 separate Major Opcodes (PO) for Double-precision and Single-precision
87 respectively.
88
89 **Changes**
90
91 Add the following entries to:
92
93 * Book I 4.6.5 Floating-Point Move Instructions
94 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
95 * Book I 1.6.1 and 1.6.2
96
97 ----------------
98
99 \newpage{}
100
101 # Immediate Tables
102
103 Tables that are used by
104 `fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:
105
106 ## `IT` -- Integer Type
107
108 | `IT` | Integer Type | Assembly Alias Mnemonic |
109 |------|-----------------|-------------------------|
110 | 0 | Signed 32-bit | `<op>w` |
111 | 1 | Unsigned 32-bit | `<op>uw` |
112 | 2 | Signed 64-bit | `<op>d` |
113 | 3 | Unsigned 64-bit | `<op>ud` |
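
As an illustration only, the `IT` encoding in the table above can be decoded as in the following Python sketch. It is not part of the specification; the names `IntType` and `decode_it` are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class IntType(NamedTuple):
    """Hypothetical helper type: one row of the IT table."""
    signed: bool
    bits: int
    alias_suffix: str  # appended to <op> to form the assembly alias


def decode_it(it: int) -> IntType:
    """Decode the 2-bit IT field (illustrative, not normative)."""
    if not 0 <= it <= 3:
        raise ValueError("IT is a 2-bit field")
    unsigned = bool(it & 0b01)       # IT = 1 and IT = 3 are unsigned
    bits = 64 if it & 0b10 else 32   # IT = 2 and IT = 3 are 64-bit
    suffix = ("u" if unsigned else "") + ("d" if bits == 64 else "w")
    return IntType(not unsigned, bits, suffix)


for it in range(4):
    print(it, decode_it(it))
```

The low bit selects signedness and the high bit selects the width, which is why the pseudo-code later in this document can test a single bit of `IT`.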
114
115 ## `CVM` -- Float to Integer Conversion Mode
116
117 | `CVM` | `rounding_mode` | Semantics |
118 |-------|-----------------|----------------------------------|
119 | 000 | from `FPSCR` | [OpenPower semantics] |
120 | 001 | Truncate | [OpenPower semantics] |
121 | 010 | from `FPSCR` | [Java/Saturating semantics] |
122 | 011 | Truncate | [Java/Saturating semantics] |
123 | 100 | from `FPSCR` | [JavaScript semantics] |
124 | 101 | Truncate | [JavaScript semantics] |
125 | rest | -- | illegal instruction trap for now |
126
127 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
128 [Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
129 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
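
The `CVM` table can likewise be read as "low bit selects the rounding source, upper two bits select the semantics". The Python sketch below illustrates that reading under this assumption; `decode_cvm` is a hypothetical name, and the exception merely stands in for the illegal instruction trap.

```python
def decode_cvm(cvm: int) -> tuple[str, str]:
    """Return (rounding mode source, semantics) for a 3-bit CVM value."""
    if not 0 <= cvm <= 7:
        raise ValueError("CVM is a 3-bit field")
    if cvm >= 6:
        # 110 and 111 are the "rest" row of the table
        raise ValueError("reserved CVM value: illegal instruction trap")
    rounding = "truncate" if cvm & 1 else "FPSCR"
    semantics = ("OpenPower", "Java/Saturating", "JavaScript")[cvm >> 1]
    return rounding, semantics


for cvm in range(6):
    print(f"{cvm:03b}", decode_cvm(cvm))
```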
130
131 ----------
132
133 \newpage{}
134
## Double-Precision Floating Move To GPR
136
137 ```
138 fmvtg RT, FRB
139 fmvtg. RT, FRB
140 ```
141
142 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
143 |-----|------|-------|-------|-------|----|--------|
144 | PO | RT | 0 | FRB | XO | Rc | X-Form |
145
146 ```
147 RT <- (FRB)
148 ```
149
Move a 64-bit float from an FPR to a GPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` only copies bits, `FPSCR` is not affected in any way.
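
The bit-copy behaviour can be modelled in software by reinterpreting the bytes of a double as a 64-bit integer. The following Python sketch is illustrative only; `fmvtg_model` and its inverse `fmvfg_model` are hypothetical names, and Python's `struct` module stands in for the store/load pair.

```python
import struct


def fmvtg_model(frb: float) -> int:
    """Return the 64-bit IEEE 754 bit pattern of a double as an integer."""
    return struct.unpack(">Q", struct.pack(">d", frb))[0]


def fmvfg_model(rb: int) -> float:
    """Reinterpret a 64-bit integer as a double (the inverse move)."""
    return struct.unpack(">d", struct.pack(">Q", rb & (2**64 - 1)))[0]


print(hex(fmvtg_model(1.0)))   # 0x3ff0000000000000
print(hex(fmvtg_model(-0.0)))  # 0x8000000000000000
```

No arithmetic is performed in either direction, which mirrors why no status flags need to change.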
153
154 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
155 operations.
156
157 Special Registers altered:
158
159 CR0 (if Rc=1)
160
161 ----------
162
163 \newpage{}
164
165 ## Floating Move To GPR Single
166
167 ```
168 fmvtgs RT, FRB
169 fmvtgs. RT, FRB
170 ```
171
172 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
173 |-----|------|-------|-------|-------|----|--------|
174 | PO | RT | 0 | FRB | XO | Rc | X-Form |
175
176 ```
177 RT <- [0] * 32 || SINGLE((FRB)) # SINGLE since that's what stfs uses
178 ```
179
Move a 32-bit float from an FPR to a GPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` only copies bits, `FPSCR` is not affected in any way.
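
A rough software model of the single-precision move is sketched below in Python. It assumes the value held in the FPR is exactly representable in single precision, which is the normal case for a register holding a 32-bit float; the behaviour of `SINGLE()` for other inputs is not modelled. The name `fmvtgs_model` is hypothetical.

```python
import struct


def fmvtgs_model(frb: float) -> int:
    """Single-precision bit pattern in the low 32 bits, upper 32 bits zero.

    Assumes frb is exactly representable as an IEEE 754 single.
    """
    return struct.unpack(">I", struct.pack(">f", frb))[0]


print(hex(fmvtgs_model(1.0)))   # 0x3f800000
print(hex(fmvtgs_model(-2.0)))  # 0xc0000000
```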
183
184 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
185 operations.
186
187 Special Registers altered:
188
189 CR0 (if Rc=1)
190
191 ----------
192
193 \newpage{}
194
195 ## Double-Precision Floating Move From GPR
196
197 ```
198 fmvfg FRT, RB
199 fmvfg. FRT, RB
200 ```
201
202 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
203 |-----|------|-------|-------|-------|----|--------|
204 | PO | FRT | 0 | RB | XO | Rc | X-Form |
205
206 ```
207 FRT <- (RB)
208 ```
209
Move a 64-bit float from a GPR to an FPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` only copies bits, `FPSCR` is not affected in any way.
213
214 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
215 operations.
216
217 Special Registers altered:
218
219 CR1 (if Rc=1)
220
221 ----------
222
223 \newpage{}
224
225 ## Floating Move From GPR Single
226
227 ```
228 fmvfgs FRT, RB
229 fmvfgs. FRT, RB
230 ```
231
232 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
233 |-----|------|-------|-------|-------|----|--------|
234 | PO | FRT | 0 | RB | XO | Rc | X-Form |
235
236 ```
237 FRT <- DOUBLE((RB)[32:63]) # DOUBLE since that's what lfs uses
238 ```
239
Move a 32-bit float from a GPR to an FPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` only copies bits, `FPSCR` is not affected in any way.
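
As a sketch of the data flow, the Python below takes the low 32 bits of an integer (bits 32:63 in Power ISA numbering), interprets them as a single-precision value, and widens the result to a double. The widening is exact for every single-precision value. `fmvfgs_model` is a hypothetical name used only here.

```python
import struct


def fmvfgs_model(rb: int) -> float:
    """Interpret (RB)[32:63] as an IEEE 754 single and widen it to a double."""
    low32 = rb & 0xFFFF_FFFF  # the upper 32 bits of RB are ignored
    return struct.unpack(">f", struct.pack(">I", low32))[0]


print(fmvfgs_model(0x3F80_0000))            # 1.0
print(fmvfgs_model(0xDEAD_BEEF_3FC0_0000))  # 1.5
```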
243
244 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
245 operations.
246
247 Special Registers altered:
248
249 CR1 (if Rc=1)
250
251 ----------
252
253 \newpage{}
254
255 ## Double-Precision Floating Convert From Integer In GPR
256
257 ```
258 fcvtfg FRT, RB, IT
259 fcvtfg. FRT, RB, IT
260 ```
261
262 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form |
263 |-----|------|-------|-------|-------|-------|----|--------|
264 | PO | FRT | IT | 0 | RB | XO | Rc | X-Form |
265
```
if IT[0] = 0 then # 32-bit int -> 64-bit float
    # rounding never necessary, so don't touch FPSCR
    # based off xvcvsxwdp
    if IT = 0 then # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    else # IT = 1 -- Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    FRT <- bfp64_CONVERT_FROM_BFP(src)
else
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0): # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1): # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2): # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default: # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
    result <- bfp64_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP64(result)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
298 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
299 don't remove them -->
300
Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.
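
The claim that 32-bit integers never need rounding follows from the 53-bit significand of a 64-bit float. The short Python check below illustrates this on a few sample values; it is not a proof, and `converts_exactly` is a hypothetical helper. Python's `float` is assumed to be an IEEE 754 double, as it is on all common platforms.

```python
def converts_exactly(value: int) -> bool:
    """True if the integer survives conversion to a 64-bit float unchanged."""
    return int(float(value)) == value


# Every 32-bit integer fits in the 53-bit significand of a double.
samples_32 = [0, 1, -1, 2**31 - 1, -2**31, 2**32 - 1]
print(all(converts_exactly(v) for v in samples_32))  # True

# A 64-bit integer may need more than 53 bits, so it can be rounded.
print(converts_exactly(2**53 + 1))  # False
```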
308
309 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
310 operations.
311
312 Special Registers altered:
313
314 CR1 (if Rc=1)
FPSCR (TODO: which bits?) (if IT[0]=1)
316
317 ### Assembly Aliases
318
319 | Assembly Alias | Full Instruction |&nbsp;| Assembly Alias | Full Instruction |
320 |----------------------|----------------------|------|----------------------|----------------------|
321 | `fcvtfgw FRT, RB` | `fcvtfg FRT, RB, 0` |&nbsp;| `fcvtfgd FRT, RB` | `fcvtfg FRT, RB, 2` |
322 | `fcvtfgw. FRT, RB` | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB` | `fcvtfg. FRT, RB, 2` |
323 | `fcvtfguw FRT, RB` | `fcvtfg FRT, RB, 1` |&nbsp;| `fcvtfgud FRT, RB` | `fcvtfg FRT, RB, 3` |
324 | `fcvtfguw. FRT, RB` | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB` | `fcvtfg. FRT, RB, 3` |
325
326 ----------
327
328 \newpage{}
329
330 ## Floating Convert From Integer In GPR Single
331
332 ```
333 fcvtfgs FRT, RB, IT
334 fcvtfgs. FRT, RB, IT
335 ```
336
337 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form |
338 |-----|------|-------|-------|-------|-------|----|--------|
339 | PO | FRT | IT | 0 | RB | XO | Rc | X-Form |
340
```
# rounding may be necessary. based off xscvuxdsp
reset_xflags()
switch(IT)
    case(0): # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    case(1): # Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    case(2): # Signed 64-bit
        src <- bfp_CONVERT_FROM_SI64((RB))
    default: # Unsigned 64-bit
        src <- bfp_CONVERT_FROM_UI64((RB))
rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
result32 <- bfp32_CONVERT_FROM_BFP(rnd)
cls <- fprf_CLASS_BFP32(result32)
result <- DOUBLE(result32)

if xx_flag = 1 then SetFX(FPSCR.XX)

FRT <- result
FPSCR.FPRF <- cls
FPSCR.FR <- inc_flag
FPSCR.FI <- xx_flag
```
365 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
366 don't remove them -->
367
Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.
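
Unlike the double-precision case, even 32-bit integers may be rounded here, because a 32-bit float has only a 24-bit significand. The Python sketch below shows the effect for round-to-nearest-even only (the `FPSCR.RN = 0b00` case). It assumes inputs of magnitude at most 2^53, so that the intermediate double is exact and no double rounding occurs; `to_single` is a hypothetical helper.

```python
import struct


def to_single(value: int) -> float:
    """Round an integer to the nearest single-precision value, ties to even.

    Assumes abs(value) <= 2**53 so that float(value) is exact.
    """
    return struct.unpack(">f", struct.pack(">f", float(value)))[0]


print(to_single(2**24))      # 16777216.0 (exact)
print(to_single(2**24 + 1))  # 16777216.0 (tie, rounded to even)
print(to_single(2**24 + 3))  # 16777220.0 (tie, rounded to even)
```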
371
372 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
373 operations.
374
375 Special Registers altered:
376
377 CR1 (if Rc=1)
FPSCR (TODO: which bits?)
379
380 ### Assembly Aliases
381
| Assembly Alias        | Full Instruction      |&nbsp;| Assembly Alias        | Full Instruction      |
|-----------------------|-----------------------|------|-----------------------|-----------------------|
| `fcvtfgws FRT, RB`    | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`    | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`   | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`   | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`   | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`   | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB`  | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB`  | `fcvtfgs. FRT, RB, 3` |
388
389 ----------
390
391 \newpage{}
392
393 ## Floating-point to Integer Conversion Overview
394
395 <div id="fpr-to-gpr-conversion-mode"></div>
396
IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.
402
403 For convenience, we will give those different conversion semantics names
404 based on which common ISA or programming language uses them, since there
405 may not be an established name for them:
406
407 **Standard OpenPower conversion**
408
409 This conversion performs "saturation with NaN converted to minimum
410 valid integer". This is also exactly the same as the x86 ISA conversion
411 semantics. OpenPOWER however has instructions for both:
412
413 * rounding mode read from FPSCR
414 * rounding mode always set to truncate
415
416 **Java/Saturating conversion**
417
418 For the sake of simplicity, the FP -> Integer conversion semantics
419 generalized from those used by Java's semantics (and Rust's `as`
420 operator) will be referred to as [Java/Saturating conversion
421 semantics](#fp-to-int-java-saturating-conversion-semantics).
422
423 Those same semantics are used in some way by all of the following
424 languages (not necessarily for the default conversion method):
425
426 * Java's
427 [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
428 (only for long/int results)
429 * Rust's FP -> Integer conversion using the
430 [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
431 * LLVM's
432 [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
433 [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
434 * SPIR-V's OpenCL dialect's
435 [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
436 [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
437 instructions when decorated with
438 [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
442
443 **JavaScript conversion**
444
For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).
449
This conversion is available as a single instruction, `FJCVTZS`, in the
Arm ISA: <https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
452
453 **Rc=1 and OE=1**
454
455 All of these instructions have an Rc=1 mode which sets CR0
456 in the normal way for any instructions producing a GPR result.
457 Additionally, when OE=1, if the numerical value of the FP number
458 is not 100% accurately preserved (due to truncation or saturation
459 and including when the FP number was NaN) then this is considered
460 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
461 are all set as normal for any GPR instructions that overflow.
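
The overflow condition described above amounts to "the integer result does not represent exactly the same number as the floating-point input". A minimal Python sketch of that test follows; `conversion_overflowed` is a hypothetical name, and the sketch relies on Python comparing a `float` with an `int` by exact numerical value.

```python
import math


def conversion_overflowed(v: float, result: int) -> bool:
    """True if `result` does not exactly preserve the value of `v`."""
    # NaN never compares equal, but the test is spelled out for clarity.
    return math.isnan(v) or v != result


print(conversion_overflowed(2.0, 2))            # False: value preserved
print(conversion_overflowed(1.5, 1))            # True: fraction truncated
print(conversion_overflowed(1e10, 2**31 - 1))   # True: saturated
print(conversion_overflowed(float("nan"), 0))   # True: NaN input
```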
462
463 \newpage{}
464
465 ### FP to Integer Conversion Simplified Pseudo-code
466
467 Key for pseudo-code:
468
469 | term | result type | definition |
470 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
471 | `fp` | -- | `f32` or `f64` (or other types from SimpleV) |
472 | `int` | -- | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV) |
473 | `uint` | -- | the unsigned integer of the same bit-width as `int` |
474 | `int::BITS` | `int` | the bit-width of `int` |
475 | `uint::MIN_VALUE` | `uint` | the minimum value `uint` can store: `0` |
476 | `uint::MAX_VALUE` | `uint` | the maximum value `uint` can store: `2^int::BITS - 1` |
477 | `int::MIN_VALUE` | `int` | the minimum value `int` can store : `-2^(int::BITS-1)` |
478 | `int::MAX_VALUE` | `int` | the maximum value `int` can store : `2^(int::BITS-1) - 1` |
479 | `int::VALUE_COUNT` | Integer | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`. |
480 | `rint(fp, rounding_mode)` | `fp` | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode` |
481
482 <div id="fp-to-int-openpower-conversion-semantics"></div>
483 OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
484 Power ISA v3.1B):
485
```
def fp_to_int_open_power<fp, int>(v: fp) -> int:
    if v is NaN:
        return int::MIN_VALUE
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```
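
For experimentation, the OpenPower semantics can be approximated in runnable Python as below. This is a sketch under assumptions, not a reference model: the `bits`, `signed` and `rounding_mode` parameters and the `RINT` table are hypothetical, and the value is rounded first and then clamped, which gives the same results as the comparisons in the pseudo-code.

```python
import math

# Rounding helpers; each returns an exact Python int for a finite float.
RINT = {"nearest_even": round, "trunc": math.trunc,
        "ceil": math.ceil, "floor": math.floor}


def fp_to_int_open_power(v: float, bits: int = 32, signed: bool = True,
                         rounding_mode: str = "trunc") -> int:
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return lo                    # NaN -> minimum valid integer
    if math.isinf(v):
        return hi if v > 0 else lo   # infinities saturate
    return max(lo, min(hi, RINT[rounding_mode](v)))


print(fp_to_int_open_power(float("nan")))  # -2147483648
print(fp_to_int_open_power(3e9))           # 2147483647
print(fp_to_int_open_power(-1.9))          # -1
```

For unsigned types the minimum valid integer is `0`, so a NaN input yields `0` in that case.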
496
497 <div id="fp-to-int-java-saturating-conversion-semantics"></div>
498 [Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
499 (only for long/int results)/
500 [Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
501 (with adjustment to add non-truncate rounding modes):
502
```
def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
    if v is NaN:
        return 0
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```
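
A runnable Python approximation of the Java/Saturating semantics is sketched below. The only difference from the OpenPower variant is that NaN converts to `0`. As before, the `bits`, `signed` and `rounding_mode` parameters and the `RINT` table are hypothetical conveniences for this example.

```python
import math

# Rounding helpers; each returns an exact Python int for a finite float.
RINT = {"nearest_even": round, "trunc": math.trunc,
        "ceil": math.ceil, "floor": math.floor}


def fp_to_int_java_saturating(v: float, bits: int = 32, signed: bool = True,
                              rounding_mode: str = "trunc") -> int:
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return 0                     # NaN -> 0, unlike OpenPower semantics
    if math.isinf(v):
        return hi if v > 0 else lo   # infinities saturate
    return max(lo, min(hi, RINT[rounding_mode](v)))


print(fp_to_int_java_saturating(float("nan")))   # 0
print(fp_to_int_java_saturating(1e20, bits=64))  # 9223372036854775807
print(fp_to_int_java_saturating(3.99))           # 3
```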
513
514 <div id="fp-to-int-javascript-conversion-semantics"></div>
515 Section 7.1 of the ECMAScript / JavaScript
516 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
517 (with adjustment to add non-truncate rounding modes):
518
```
def fp_to_int_java_script<fp, int>(v: fp) -> int:
    if v is NaN or infinite:
        return 0
    v = rint(v, rounding_mode) # assume no loss of precision in result
    v = v mod int::VALUE_COUNT # 2^32 for i32, 2^64 for i64, result is non-negative
    bits = (uint)v
    return (int)bits
```
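
The modular behaviour is easiest to see by running it. The Python sketch below follows the pseudo-code step by step, using arbitrary-precision integers so that the rounded value is never truncated before the modulo. The `bits`, `signed` and `rounding_mode` parameters and the `RINT` table are hypothetical.

```python
import math

# Rounding helpers; each returns an exact Python int for a finite float.
RINT = {"nearest_even": round, "trunc": math.trunc,
        "ceil": math.ceil, "floor": math.floor}


def fp_to_int_java_script(v: float, bits: int = 32, signed: bool = True,
                          rounding_mode: str = "trunc") -> int:
    if math.isnan(v) or math.isinf(v):
        return 0
    # Python's % with a positive modulus always yields a non-negative result.
    wrapped = RINT[rounding_mode](v) % (1 << bits)
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits         # reinterpret the bits as two's complement
    return wrapped


print(fp_to_int_java_script(4294967301.0))  # 5 (2^32 + 5 wraps around)
print(fp_to_int_java_script(2147483648.0))  # -2147483648
print(fp_to_int_java_script(-1.0, signed=False))  # 4294967295
```

With `bits=32`, `signed=True` and truncation, this matches the result of JavaScript's `ToInt32` for the inputs shown.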
528
529
530 ----------
531
532 \newpage{}
533
534
535 ## Double-Precision Floating Convert To Integer In GPR
536
537 ```
538 fcvttg RT, FRB, CVM, IT
539 fcvttg. RT, FRB, CVM, IT
540 fcvttgo RT, FRB, CVM, IT
541 fcvttgo. RT, FRB, CVM, IT
542 ```
543
544 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form |
545 |-----|------|-------|-------|-------|-------|----|----|---------|
546 | PO | RT | IT | CVM | FRB | XO | OE | Rc | XO-Form |
547
```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP64((FRB))

switch(IT)
    case(0): # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1): # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2): # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default: # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1): # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3): # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default: # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0): # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1): # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2): # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default: # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1 # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```
648
Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.
652
653 These instructions have an Rc=1 mode which sets CR0 in the normal
654 way for any instructions producing a GPR result. Additionally, when OE=1,
655 if the numerical value of the FP number is not 100% accurately preserved
656 (due to truncation or saturation and including when the FP number was
657 NaN) then this is considered to be an Integer Overflow condition, and
658 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
659 that overflow.
660
661 Special Registers altered:
662
663 CR0 (if Rc=1)
664 XER SO, OV, OV32 (if OE=1)
FPSCR (TODO: which bits?)
666
667 ### Assembly Aliases
668
669 | Assembly Alias | Full Instruction | Assembly Alias | Full Instruction |
670 |---------------------------|----------------------------|---------------------------|----------------------------|
671 | `fcvttgw RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 0` | `fcvttgd RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 2` |
672 | `fcvttgw. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 0` | `fcvttgd. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 2` |
673 | `fcvttgwo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 0` | `fcvttgdo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 2` |
674 | `fcvttgwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 2` |
675 | `fcvttguw RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 1` | `fcvttgud RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 3` |
676 | `fcvttguw. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 1` | `fcvttgud. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 3` |
677 | `fcvttguwo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 1` | `fcvttgudo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 3` |
678 | `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
679
680 ----------
681
682 \newpage{}
683
684 ## Floating Convert Single To Integer In GPR
685
686 ```
687 fcvtstg RT, FRB, CVM, IT
688 fcvtstg. RT, FRB, CVM, IT
689 fcvtstgo RT, FRB, CVM, IT
690 fcvtstgo. RT, FRB, CVM, IT
691 ```
692
693 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form |
694 |-----|------|-------|-------|-------|-------|----|----|---------|
695 | PO | RT | IT | CVM | FRB | XO | OE | Rc | XO-Form |
696
```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

switch(IT)
    case(0): # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1): # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2): # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default: # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1): # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3): # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default: # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0): # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1): # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2): # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default: # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1 # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```
797
Convert from a 32-bit float in FRB (held in the usual 32-bit float in
64-bit float format) to an unsigned/signed 32/64-bit integer in RT, with
the conversion overflow/rounding semantics following the chosen `CVM`
value. `FPSCR` is modified and exceptions are raised as usual.
802
803 These instructions have an Rc=1 mode which sets CR0 in the normal
804 way for any instructions producing a GPR result. Additionally, when OE=1,
805 if the numerical value of the FP number is not 100% accurately preserved
806 (due to truncation or saturation and including when the FP number was
807 NaN) then this is considered to be an Integer Overflow condition, and
808 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
809 that overflow.
810
811 Special Registers altered:
812
813 CR0 (if Rc=1)
814 XER SO, OV, OV32 (if OE=1)
FPSCR (TODO: which bits?)
816
817 ### Assembly Aliases
818
819 | Assembly Alias | Full Instruction | Assembly Alias | Full Instruction |
820 |----------------------------|-----------------------------|----------------------------|-----------------------------|
821 | `fcvtstgw RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 0` | `fcvtstgd RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 2` |
822 | `fcvtstgw. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 0` | `fcvtstgd. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 2` |
823 | `fcvtstgwo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 0` | `fcvtstgdo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 2` |
824 | `fcvtstgwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 2` |
825 | `fcvtstguw RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 1` | `fcvtstgud RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 3` |
826 | `fcvtstguw. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 1` | `fcvtstgud. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 3` |
827 | `fcvtstguwo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 1` | `fcvtstgudo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 3` |
828 | `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
829
830 ----------
831
832 \newpage{}
833
834 ----------
835
836 # Appendices
837
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic
842
843 |Form| Book | Page | Version | mnemonic | Description |
844 |----|------|------|---------|----------|-------------|
845 |VA | I | # | 3.2B |todo | |
846
847 ----------------
848
849 [[!tag opf_rfc]]