1 # RFC ls006 FPR <-> GPR Move/Conversion
2
3 **URLs**:
4
5 * <https://libre-soc.org/openpower/sv/int_fp_mv/>
6 * <https://libre-soc.org/openpower/sv/rfc/ls006/>
7 * <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
8 * <https://git.openpower.foundation/isa/PowerISA/issues/todo>
9
10 **Severity**: Major
11
12 **Status**: New
13
14 **Date**: 20 Oct 2022
15
16 **Target**: v3.2B
17
18 **Source**: v3.1B
19
20 **Books and Section affected**: **UPDATE**
21
22 * Book I 4.6.5 Floating-Point Move Instructions
23 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
24 * Appendix E Power ISA sorted by opcode
25 * Appendix F Power ISA sorted by version
26 * Appendix G Power ISA sorted by Compliancy Subset
27 * Appendix H Power ISA sorted by mnemonic
28
29 **Summary**
30
31 Instructions added
32
33 * `fmvtg` -- Floating Move to GPR
34 * `fmvfg` -- Floating Move from GPR
35 * `fcvttg`/`fcvttgo` -- Floating Convert to Integer in GPR
36 * `fcvtfg` -- Floating Convert from Integer in GPR
37
38 **Submitter**: Luke Leighton (Libre-SOC)
39
40 **Requester**: Libre-SOC
41
42 **Impact on processor**:
43
44 * Addition of five new GPR-FPR-based instructions
45
46 **Impact on software**:
47
48 * Requires support for new instructions in assembler, debuggers,
49 and related tools.
50
51 **Keywords**:
52
53 ```
54 GPR, FPR, Move, Conversion, JavaScript
55 ```
56
57 **Motivation**
58
CPUs without VSX/VMX lack a way to efficiently transfer data between
FPRs and GPRs: every transfer has to go through memory. This proposal adds
instructions (both bitwise copy and Integer <-> FP conversion) that
transfer data directly between FPRs and GPRs without needing to go
through memory.
64
IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. The FP to Integer
conversion instructions therefore provide several conversion modes; the
section "Floating-point to Integer Conversion Overview" lists the variants
and the languages and hardware that implement each of them.
70
71 **Notes and Observations**:
72
73 * These instructions are present in many other ISAs.
74 * JavaScript rounding as one instruction saves 35 instructions including
75 six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
76
77 **Changes**
78
79 Add the following entries to:
80
81 * Book I 4.6.5 Floating-Point Move Instructions
82 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
83 * Book I 1.6.1 and 1.6.2
84
85 ----------------
86
87 \newpage{}
88
89 # Immediate Tables
90
Tables that are used by
`fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:
93
94 ## `RCS` -- `Rc` and `s`
95
96 | `RCS` | `Rc` | FP Single Mode | Assembly Alias Mnemonic |
97 |-------|------|----------------|-------------------------|
98 | 0 | 0 | Double | `<op>` |
99 | 1 | 1 | Double | `<op>.` |
100 | 2 | 0 | Single | `<op>s` |
101 | 3 | 1 | Single | `<op>s.` |
102
103 ## `IT` -- Integer Type
104
105 | `IT` | Integer Type | Assembly Alias Mnemonic |
106 |------|-----------------|-------------------------|
107 | 0 | Signed 32-bit | `<op>w` |
108 | 1 | Unsigned 32-bit | `<op>uw` |
109 | 2 | Signed 64-bit | `<op>d` |
110 | 3 | Unsigned 64-bit | `<op>ud` |
111
112 ## `CVM` -- Float to Integer Conversion Mode
113
114 | `CVM` | `rounding_mode` | Semantics |
115 |-------|-----------------|----------------------------------|
116 | 000 | from `FPSCR` | [OpenPower semantics] |
117 | 001 | Truncate | [OpenPower semantics] |
118 | 010 | from `FPSCR` | [Java/Saturating semantics] |
119 | 011 | Truncate | [Java/Saturating semantics] |
120 | 100 | from `FPSCR` | [JavaScript semantics] |
121 | 101 | Truncate | [JavaScript semantics] |
122 | rest | -- | illegal instruction trap for now |
123
124 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
125 [Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
126 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
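
As a reading aid, the `IT` and `CVM` tables above can be expressed as a small decoder. This is only a sketch: the names `IntType`, `ConvMode`, `decode_it` and `decode_cvm` are hypothetical and are not part of this RFC, and the semantics strings are labels chosen for illustration.

```python
from typing import NamedTuple


class IntType(NamedTuple):
    signed: bool
    bits: int


class ConvMode(NamedTuple):
    truncate: bool   # True: always truncate; False: rounding mode from FPSCR
    semantics: str   # "openpower", "java" or "javascript" (illustrative labels)


def decode_it(it: int) -> IntType:
    """Decode the 2-bit IT field following the `IT` table."""
    if not 0 <= it <= 3:
        raise ValueError("IT is a 2-bit field")
    # Low bit selects unsigned, high bit selects 64-bit.
    return IntType(signed=(it & 1) == 0, bits=64 if it & 2 else 32)


def decode_cvm(cvm: int) -> ConvMode:
    """Decode the 3-bit CVM field following the `CVM` table."""
    if not 0 <= cvm <= 5:
        # The table lists the remaining values as an illegal instruction trap.
        raise ValueError("reserved CVM value")
    semantics = ("openpower", "java", "javascript")[cvm >> 1]
    return ConvMode(truncate=bool(cvm & 1), semantics=semantics)
```

For example, `decode_it(3)` describes an unsigned 64-bit integer and `decode_cvm(5)` describes truncating JavaScript semantics.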
127
128 ----------
129
130 \newpage{}
131
132 ## FPR to GPR Move
133
134 `fmvtg RT, FRB`
135 `fmvtg. RT, FRB`
136
137 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
138 |-----|------|-------|-------|-------|----|--------|
139 | PO | RT | 0 | FRB | XO | Rc | X-Form |
140
141 ```
142 RT <- (FRB)
143 ```
144
Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.
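
The bit-copy behaviour can be modelled in a few lines of Python. This is a software sketch only, assuming the FPR value is available as a Python `float` (an IEEE 754 binary64 value); the function name `fmvtg` is reused here purely as an illustrative label.

```python
import struct


def fmvtg(frb: float) -> int:
    """Return the 64-bit IEEE 754 bit pattern of `frb` as an unsigned integer.

    No rounding or conversion takes place: the eight bytes of the double
    are reinterpreted as a 64-bit integer, as `stfd` followed by `ld` would do.
    """
    return struct.unpack(">Q", struct.pack(">d", frb))[0]
```

For instance, `fmvtg(1.0)` gives `0x3FF0_0000_0000_0000`, the binary64 encoding of 1.0, rather than the integer 1.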
148
149 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
150 operations.
151
152 Special Registers altered:
153
154 CR0 (if Rc=1)
155
156 ----------
157
158 \newpage{}
159
160 ## FPR to GPR Move Single
161
162 `fmvtgs RT, FRB`
163 `fmvtgs. RT, FRB`
164
165 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
166 |-----|------|-------|-------|-------|----|--------|
167 | PO | RT | 0 | FRB | XO | Rc | X-Form |
168
169 ```
170 RT <- [0] * 32 || SINGLE((FRB)) # SINGLE since that's what stfs uses
171 ```
172
Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.
176
177 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
178 operations.
179
180 Special Registers altered:
181
182 CR0 (if Rc=1)
183
184 ----------
185
186 \newpage{}
187
188 ## GPR to FPR Move
189
190 `fmvfg FRT, RB`
191 `fmvfg. FRT, RB`
192
193 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
194 |-----|------|-------|-------|-------|----|--------|
195 | PO | FRT | 0 | RB | XO | Rc | X-Form |
196
197 ```
198 FRT <- (RB)
199 ```
200
Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.
204
205 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
206 operations.
207
208 Special Registers altered:
209
210 CR1 (if Rc=1)
211
212 ----------
213
214 \newpage{}
215
216 ## GPR to FPR Move Single
217
218 `fmvfgs FRT, RB`
219 `fmvfgs. FRT, RB`
220
221 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
222 |-----|------|-------|-------|-------|----|--------|
223 | PO | FRT | 0 | RB | XO | Rc | X-Form |
224
225 ```
226 FRT <- DOUBLE((RB)[32:63]) # DOUBLE since that's what lfs uses
227 ```
228
Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.
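
A minimal Python model of the single-precision move, assuming the GPR is given as a 64-bit unsigned integer and the FPR is modelled as a Python `float` (so the widening that `DOUBLE` performs happens implicitly). The function name is an illustrative label only.

```python
import struct


def fmvfgs(rb: int) -> float:
    """Interpret the low 32 bits of `rb` as an IEEE 754 binary32 value."""
    # (RB)[32:63] in MSB0 numbering is the least-significant 32 bits.
    low32 = rb & 0xFFFF_FFFF
    # Unpacking a binary32 widens it to a Python float without changing its value.
    return struct.unpack(">f", struct.pack(">I", low32))[0]
```

The upper half of the GPR is ignored, so `fmvfgs(0xDEAD_BEEF_4000_0000)` and `fmvfgs(0x4000_0000)` both yield 2.0.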
232
233 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
234 operations.
235
236 Special Registers altered:
237
238 CR1 (if Rc=1)
239
240 ----------
241
242 \newpage{}
243
244 ## Floating-point Convert From GPR
245
246 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form |
247 |-----|------|-------|-------|-------|-------|----|--------|
248 | PO | FRT | IT | 0 | RB | XO | Rc | X-Form |
249
`fcvtfg FRT, RB, IT`
`fcvtfg. FRT, RB, IT`
254
```
if IT[0] = 0 then  # 32-bit int -> 64-bit float
    # rounding never necessary, so don't touch FPSCR
    # based off xvcvsxwdp
    if IT = 0 then  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    else  # IT = 1 -- Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    FRT <- bfp64_CONVERT_FROM_BFP(src)
else
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0):  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1):  # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2):  # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default:  # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
    result <- bfp64_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP64(result)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
287 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
288 don't remove them -->
289
Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.
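
The difference between the 32-bit and 64-bit cases can be checked with a short Python sketch. It assumes round-to-nearest-even (the behaviour of Python's own `int` to `float` conversion) in place of `FPSCR.RN`, and it models only the result and an "inexact" indication, not the real `FPSCR` fields. The function name and return shape are hypothetical.

```python
from typing import Tuple


def fcvtfg(rb: int, it: int) -> Tuple[float, bool]:
    """Return (result, inexact) for a 64-bit register value `rb` and 2-bit `it`."""
    bits = 64 if it & 2 else 32
    value = rb & ((1 << bits) - 1)          # select (RB)[32:63] or all of (RB)
    if (it & 1) == 0 and value >> (bits - 1):
        value -= 1 << bits                  # reinterpret as a signed integer
    result = float(value)                   # rounds to nearest, ties to even
    return result, int(result) != value
```

Every 32-bit integer fits in the 53-bit significand of a 64-bit float, so the second element is always `False` for `it` 0 and 1. A 64-bit input such as `2**53 + 1` has no exact representation and is reported as inexact.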
297
298 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
299 operations.
300
301 Special Registers altered:
302
    CR1 (if Rc=1)
    FPSCR (TODO: which bits?) (if IT[0]=1)
305
306 ### Assembly Aliases
307
308 | Assembly Alias | Full Instruction |&nbsp;| Assembly Alias | Full Instruction |
309 |----------------------|----------------------|------|----------------------|----------------------|
310 | `fcvtfgw FRT, RB` | `fcvtfg FRT, RB, 0` |&nbsp;| `fcvtfgd FRT, RB` | `fcvtfg FRT, RB, 2` |
311 | `fcvtfgw. FRT, RB` | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB` | `fcvtfg. FRT, RB, 2` |
312 | `fcvtfguw FRT, RB` | `fcvtfg FRT, RB, 1` |&nbsp;| `fcvtfgud FRT, RB` | `fcvtfg FRT, RB, 3` |
313 | `fcvtfguw. FRT, RB` | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB` | `fcvtfg. FRT, RB, 3` |
314
315 ----------
316
317 \newpage{}
318
319 ## Floating-point Convert From GPR Single
320
321 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form |
322 |-----|------|-------|-------|-------|-------|----|--------|
323 | PO | FRT | IT | 0 | RB | XO | Rc | X-Form |
324
325 `fcvtfgs FRT, RB, IT`
326 `fcvtfgs. FRT, RB, IT`
327
```
# rounding may be necessary. based off xscvuxdsp
reset_xflags()
switch(IT)
    case(0):  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    case(1):  # Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    case(2):  # Signed 64-bit
        src <- bfp_CONVERT_FROM_SI64((RB))
    default:  # Unsigned 64-bit
        src <- bfp_CONVERT_FROM_UI64((RB))
rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
result32 <- bfp32_CONVERT_FROM_BFP(rnd)
cls <- fprf_CLASS_BFP32(result32)
result <- DOUBLE(result32)

if xx_flag = 1 then SetFX(FPSCR.XX)

FRT <- result
FPSCR.FPRF <- cls
FPSCR.FR <- inc_flag
FPSCR.FI <- xx_flag
```
352 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
353 don't remove them -->
354
Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.
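
Unlike the 64-bit float case, even 32-bit integers may need rounding here, because a 32-bit float has only a 24-bit significand. The sketch below illustrates this for 32-bit inputs only, assuming round-to-nearest-even rather than `FPSCR.RN`. It is deliberately restricted to 32-bit integers: routing a 64-bit integer through a Python `float` first could round twice. The name `fcvtfgs_32` is hypothetical.

```python
import struct


def fcvtfgs_32(rb: int, signed: bool) -> float:
    """Convert the low 32 bits of `rb` to the nearest binary32 value."""
    value = rb & 0xFFFF_FFFF
    if signed and value & 0x8000_0000:
        value -= 1 << 32
    # int -> double is exact for 32-bit values, so the only rounding step
    # is the double -> single narrowing performed by struct.pack.
    return struct.unpack(">f", struct.pack(">f", float(value)))[0]
```

With this model `16777217` (2^24 + 1) becomes `16777216.0`, showing the rounding that sets the inexact flag in the instruction.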
358
359 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
360 operations.
361
362 Special Registers altered:
363
    CR1 (if Rc=1)
    FPSCR (TODO: which bits?)
366
367 ### Assembly Aliases
368
| Assembly Alias       | Full Instruction       |&nbsp;| Assembly Alias       | Full Instruction       |
|----------------------|------------------------|------|----------------------|------------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`   |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`   |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0`  |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2`  |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`   |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`   |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1`  |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3`  |
375
376 ----------
377
378 \newpage{}
379
380 ## Floating-point to Integer Conversion Overview
381
382 <div id="fpr-to-gpr-conversion-mode"></div>
383
IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.
389
390 For convenience, we will give those different conversion semantics names
391 based on which common ISA or programming language uses them, since there
392 may not be an established name for them:
393
394 **Standard OpenPower conversion**
395
396 This conversion performs "saturation with NaN converted to minimum
397 valid integer". This is also exactly the same as the x86 ISA conversion
398 semantics. OpenPOWER however has instructions for both:
399
400 * rounding mode read from FPSCR
401 * rounding mode always set to truncate
402
403 **Java/Saturating conversion**
404
405 For the sake of simplicity, the FP -> Integer conversion semantics
406 generalized from those used by Java's semantics (and Rust's `as`
407 operator) will be referred to as [Java/Saturating conversion
408 semantics](#fp-to-int-java-saturating-conversion-semantics).
409
410 Those same semantics are used in some way by all of the following
411 languages (not necessarily for the default conversion method):
412
413 * Java's
414 [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
415 (only for long/int results)
416 * Rust's FP -> Integer conversion using the
417 [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
418 * LLVM's
419 [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
420 [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
421 * SPIR-V's OpenCL dialect's
422 [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
423 [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
424 instructions when decorated with
425 [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
429
430 **JavaScript conversion**
431
For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).
436
437 This instruction is present in ARM assembler as FJCVTZS
438 <https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
439
440 **Rc=1 and OE=1**
441
442 All of these instructions have an Rc=1 mode which sets CR0
443 in the normal way for any instructions producing a GPR result.
444 Additionally, when OE=1, if the numerical value of the FP number
445 is not 100% accurately preserved (due to truncation or saturation
446 and including when the FP number was NaN) then this is considered
447 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
448 are all set as normal for any GPR instructions that overflow.
449
450 \newpage{}
451
452 ### FP to Integer Conversion Simplified Pseudo-code
453
454 Key for pseudo-code:
455
456 | term | result type | definition |
457 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
458 | `fp` | -- | `f32` or `f64` (or other types from SimpleV) |
459 | `int` | -- | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV) |
460 | `uint` | -- | the unsigned integer of the same bit-width as `int` |
461 | `int::BITS` | `int` | the bit-width of `int` |
462 | `uint::MIN_VALUE` | `uint` | the minimum value `uint` can store: `0` |
463 | `uint::MAX_VALUE` | `uint` | the maximum value `uint` can store: `2^int::BITS - 1` |
464 | `int::MIN_VALUE` | `int` | the minimum value `int` can store : `-2^(int::BITS-1)` |
465 | `int::MAX_VALUE` | `int` | the maximum value `int` can store : `2^(int::BITS-1) - 1` |
466 | `int::VALUE_COUNT` | Integer | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`. |
467 | `rint(fp, rounding_mode)` | `fp` | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode` |
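
As an illustration of the `rint` term in the key above, here is a minimal Python sketch for finite inputs. It assumes `rounding_mode` is given in the `FPSCR.RN` encoding used by the instruction pseudo-code (0b00 nearest-even, 0b01 truncate, 0b10 ceiling, 0b11 floor). It is not a full model: NaN, infinities and the sign of zero are not handled.

```python
import math


def rint(v: float, rn: int) -> float:
    """Round the finite value `v` to an integral value using mode `rn`."""
    if rn == 0b00:
        return float(round(v))       # Python's round() rounds ties to even
    if rn == 0b01:
        return float(math.trunc(v))  # toward zero
    if rn == 0b10:
        return float(math.ceil(v))   # toward +infinity
    if rn == 0b11:
        return float(math.floor(v))  # toward -infinity
    raise ValueError("rn is a 2-bit field")
```

For example, `rint(2.5, 0b00)` is `2.0` while `rint(3.5, 0b00)` is `4.0`, and `rint(-2.1, 0b11)` is `-3.0`.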
468
469 <div id="fp-to-int-openpower-conversion-semantics"></div>
470 OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
471 Power ISA v3.1B):
472
```
def fp_to_int_open_power<fp, int>(v: fp) -> int:
    if v is NaN:
        return int::MIN_VALUE
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```
483
484 <div id="fp-to-int-java-saturating-conversion-semantics"></div>
485 [Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
486 (only for long/int results)/
487 [Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
488 (with adjustment to add non-truncate rounding modes):
489
```
def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
    if v is NaN:
        return 0
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```
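
The OpenPower and Java/Saturating pseudo-code above differ only in the result for NaN, which the following runnable Python sketch makes explicit. It is an illustration under stated assumptions: rounding is fixed to truncate, and the function name and the `nan_is_zero` parameter are hypothetical.

```python
import math


def saturating_fp_to_int(v: float, bits: int, signed: bool,
                         nan_is_zero: bool) -> int:
    """Saturating FP -> integer conversion with truncation.

    nan_is_zero=False models the OpenPower variant (NaN -> minimum value),
    nan_is_zero=True models the Java/Saturating variant (NaN -> 0).
    """
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return 0 if nan_is_zero else lo
    if v >= hi:          # also catches +infinity
        return hi
    if v <= lo:          # also catches -infinity
        return lo
    return math.trunc(v)
```

For a signed 32-bit target, a NaN input gives `-2**31` with `nan_is_zero=False` and `0` with `nan_is_zero=True`; all other inputs give the same result in both variants.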
500
501 <div id="fp-to-int-javascript-conversion-semantics"></div>
502 Section 7.1 of the ECMAScript / JavaScript
503 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
504 (with adjustment to add non-truncate rounding modes):
505
```
def fp_to_int_java_script<fp, int>(v: fp) -> int:
    if v is NaN or infinite:
        return 0
    v = rint(v, rounding_mode)  # assume no loss of precision in result
    v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
    bits = (uint)v
    return (int)bits
```
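
A runnable Python counterpart of the JavaScript pseudo-code above, offered as a sketch: rounding is fixed to truncate (as in `ToInt32`), and the function name and its `bits`/`signed` parameters are illustrative assumptions. Python integers are arbitrary precision, so the modular step is exact.

```python
import math


def fp_to_int_javascript(v: float, bits: int = 32, signed: bool = True) -> int:
    """Modular (wrapping) FP -> integer conversion with truncation."""
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = math.trunc(v) % (1 << bits)   # Python's % is non-negative here
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits                # reinterpret the bits as signed
    return wrapped
```

Out-of-range values wrap instead of saturating: `fp_to_int_javascript(4294967301.0)` (2^32 + 5) yields `5`, and `fp_to_int_javascript(2147483648.0)` yields `-2147483648`.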
515
516
517 ----------
518
519 \newpage{}
520
521
522 ## Floating-point Convert To GPR
523
524 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form |
525 |-----|------|-------|-------|-------|-------|----|----|---------|
526 | PO | RT | IT | CVM | FRB | XO | OE | Rc | XO-Form |
527
528 `fcvttg RT, FRB, CVM, IT`
529 `fcvttg. RT, FRB, CVM, IT`
530 `fcvttgo RT, FRB, CVM, IT`
531 `fcvttgo. RT, FRB, CVM, IT`
532
```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP64((FRB))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```
633
Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.
637
638 These instructions have an Rc=1 mode which sets CR0 in the normal
639 way for any instructions producing a GPR result. Additionally, when OE=1,
640 if the numerical value of the FP number is not 100% accurately preserved
641 (due to truncation or saturation and including when the FP number was
642 NaN) then this is considered to be an Integer Overflow condition, and
643 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
644 that overflow.
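
The overflow condition described above amounts to checking whether the integer result still has the same numerical value as the floating-point source. A minimal Python sketch of that check follows. The function name is hypothetical, and it models only the condition itself, not the update of CR0 or XER.

```python
import math


def conversion_overflowed(src: float, result: int) -> bool:
    """True when `result` does not exactly preserve the value of `src`."""
    # Python compares float and int by exact numerical value, so any
    # rounding, truncation, saturation or wrapping shows up as a mismatch.
    return math.isnan(src) or src != result
```

For example, converting `3.5` to `3` or saturating `1e10` to `2**31 - 1` counts as overflow, while converting `3.0` to `3` does not.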
645
646 Special Registers altered:
647
    CR0 (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR (TODO: which bits?)
651
652 ### Assembly Aliases
653
654 | Assembly Alias | Full Instruction | Assembly Alias | Full Instruction |
655 |---------------------------|----------------------------|---------------------------|----------------------------|
656 | `fcvttgw RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 0` | `fcvttgd RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 2` |
657 | `fcvttgw. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 0` | `fcvttgd. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 2` |
658 | `fcvttgwo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 0` | `fcvttgdo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 2` |
659 | `fcvttgwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 2` |
660 | `fcvttguw RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 1` | `fcvttgud RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 3` |
661 | `fcvttguw. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 1` | `fcvttgud. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 3` |
662 | `fcvttguwo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 1` | `fcvttgudo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 3` |
663 | `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
664
665 ----------
666
667 \newpage{}
668
669 ## Floating-point Convert To GPR Single
670
671 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form |
672 |-----|------|-------|-------|-------|-------|----|----|---------|
673 | PO | RT | IT | CVM | FRB | XO | OE | Rc | XO-Form |
674
675 `fcvtstg RT, FRB, CVM, IT`
676 `fcvtstg. RT, FRB, CVM, IT`
677 `fcvtstgo RT, FRB, CVM, IT`
678 `fcvtstgo. RT, FRB, CVM, IT`
679
```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```
780
Convert from a 32-bit float in FRB (held in the usual 32-bit float in
64-bit float format) to an unsigned/signed 32/64-bit integer in RT, with
the conversion overflow/rounding semantics following the chosen `CVM`
value. `FPSCR` is modified and exceptions are raised as usual.
785
786 These instructions have an Rc=1 mode which sets CR0 in the normal
787 way for any instructions producing a GPR result. Additionally, when OE=1,
788 if the numerical value of the FP number is not 100% accurately preserved
789 (due to truncation or saturation and including when the FP number was
790 NaN) then this is considered to be an Integer Overflow condition, and
791 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
792 that overflow.
793
794 Special Registers altered:
795
    CR0 (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR (TODO: which bits?)
799
800 ### Assembly Aliases
801
802 | Assembly Alias | Full Instruction | Assembly Alias | Full Instruction |
803 |----------------------------|-----------------------------|----------------------------|-----------------------------|
804 | `fcvtstgw RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 0` | `fcvtstgd RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 2` |
805 | `fcvtstgw. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 0` | `fcvtstgd. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 2` |
806 | `fcvtstgwo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 0` | `fcvtstgdo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 2` |
807 | `fcvtstgwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 2` |
808 | `fcvtstguw RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 1` | `fcvtstgud RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 3` |
809 | `fcvtstguw. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 1` | `fcvtstgud. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 3` |
810 | `fcvtstguwo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 1` | `fcvtstgudo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 3` |
811 | `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
812
813 ----------
814
815 \newpage{}
816
817 ----------
818
819 # Appendices
820
821 Appendix E Power ISA sorted by opcode
822 Appendix F Power ISA sorted by version
823 Appendix G Power ISA sorted by Compliancy Subset
824 Appendix H Power ISA sorted by mnemonic
825
826 |Form| Book | Page | Version | mnemonic | Description |
827 |----|------|------|---------|----------|-------------|
828 |VA | I | # | 3.2B |todo | |
829
830 ----------------
831
832 [[!tag opf_rfc]]