# RFC ls006 FPR <-> GPR Move/Conversion

**URLs**:

* <https://libre-soc.org/openpower/sv/int_fp_mv/>
* <https://libre-soc.org/openpower/sv/rfc/ls006/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Single-precision Instructions added:

* `fmvtgs` -- Single-Precision Floating Move To GPR
* `fmvfgs` -- Single-Precision Floating Move From GPR
* `fcvtstg` -- Single-Precision Floating Convert To Integer In GPR
* `fcvtfgs` -- Single-Precision Floating Convert From Integer In GPR

Identical (except Double-precision) Instructions added:

* `fmvtg` -- Double-Precision Floating Move To GPR
* `fmvfg` -- Double-Precision Floating Move From GPR
* `fcvttg` -- Double-Precision Floating Convert To Integer In GPR
* `fcvtfg` -- Double-Precision Floating Convert From Integer In GPR

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of four new Single-Precision GPR-FPR-based instructions
* Addition of four new Double-Precision GPR-FPR-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
GPR, FPR, Move, Conversion, JavaScript
```

**Motivation**

CPUs without VSX/VMX have no efficient way to transfer data between
FPRs and GPRs: the data has to go through memory. This proposal adds
instructions that transfer data directly between FPRs and GPRs, both as
a bitwise copy and as an Integer <-> FP conversion, without needing to
go through memory.

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. The conversion overview
later in this document describes the different variants, listing the
languages and hardware that implement each variant.

**Notes and Observations**:

* These instructions are present in many other ISAs.
* JavaScript rounding as one instruction saves 35 instructions including
  six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
* Both sets are orthogonal (no difference except being Single/Double).
  This allows IBM to follow the pre-existing precedent of allocating
  separate Major Opcodes (PO) for Double-precision and Single-precision
  respectively.

**Changes**

Add the following entries to:

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# Immediate Tables

Tables that are used by
`fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:

## `IT` -- Integer Type

| `IT` | Integer Type    | Assembly Alias Mnemonic |
|------|-----------------|-------------------------|
| 0    | Signed 32-bit   | `<op>w`                 |
| 1    | Unsigned 32-bit | `<op>uw`                |
| 2    | Signed 64-bit   | `<op>d`                 |
| 3    | Unsigned 64-bit | `<op>ud`                |

## `CVM` -- Float to Integer Conversion Mode

| `CVM` | `rounding_mode` | Semantics                        |
|-------|-----------------|----------------------------------|
| 000   | from `FPSCR`    | [OpenPower semantics]            |
| 001   | Truncate        | [OpenPower semantics]            |
| 010   | from `FPSCR`    | [Java/Saturating semantics]      |
| 011   | Truncate        | [Java/Saturating semantics]      |
| 100   | from `FPSCR`    | [JavaScript semantics]           |
| 101   | Truncate        | [JavaScript semantics]           |
| rest  | --              | illegal instruction trap for now |

[OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
[Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
[JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
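
As an illustration of how the two tables above are read, the following
Python sketch decodes both immediates. It is not part of the RFC: the
function names `decode_it` and `decode_cvm` are hypothetical, and both
fields are assumed to have already been extracted from the instruction
as plain integers.

```python
def decode_it(it: int) -> tuple[int, bool]:
    """Return (bit width, is_signed) for a 2-bit IT value."""
    if not 0 <= it <= 3:
        raise ValueError("IT is a 2-bit field")
    width = 64 if it & 2 else 32   # IT = 2, 3 select 64-bit integers
    signed = (it & 1) == 0         # IT = 1, 3 select unsigned integers
    return width, signed


def decode_cvm(cvm: int) -> tuple[str, str]:
    """Return (rounding mode source, semantics name) for a 3-bit CVM value."""
    if not 0 <= cvm <= 7:
        raise ValueError("CVM is a 3-bit field")
    if cvm >= 6:
        # The table marks the remaining encodings as an illegal instruction.
        raise ValueError("reserved CVM value")
    rounding = "truncate" if cvm & 1 else "FPSCR"
    semantics = ("OpenPower", "Java/Saturating", "JavaScript")[cvm >> 1]
    return rounding, semantics
```

The lowest bit of `CVM` selects between the `FPSCR` rounding mode and
truncation, and the upper two bits select the conversion semantics.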

----------

## Double-Precision Floating Move To GPR

```
fmvtg RT, FRB
fmvtg. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
RT <- (FRB)
```

Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

    CR0     (if Rc=1)

----------

## Floating Move To GPR Single

```
fmvtgs RT, FRB
fmvtgs. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
RT <- [0] * 32 || SINGLE((FRB)) # SINGLE since that's what stfs uses
```

Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.
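
The effect of `fmvtg` and `fmvtgs` on the register contents can be
sketched in Python with the `struct` module. This is only a model of the
bit copy, not of the instructions themselves: the Python functions
merely borrow the mnemonics as names, and `fmvtgs` here assumes the
input is already exactly representable as a single (as it would be after
`lfs` or single-precision arithmetic), because `struct` rounds where
`SINGLE()` does not.

```python
import struct


def fmvtg(frb: float) -> int:
    """Bit pattern of a double, returned as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", frb))[0]


def fmvtgs(frb: float) -> int:
    """Single-precision bit pattern, zero-extended to 64 bits.

    Assumption: frb is exactly representable in single precision.
    """
    return struct.unpack(">I", struct.pack(">f", frb))[0]


print(hex(fmvtg(1.0)))   # 64-bit pattern of 1.0
print(hex(fmvtgs(1.0)))  # 32-bit pattern of 1.0, upper 32 bits zero
```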

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

    CR0     (if Rc=1)

----------

\newpage{}

## Double-Precision Floating Move From GPR

```
fmvfg FRT, RB
fmvfg. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
FRT <- (RB)
```

Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)

----------

## Floating Move From GPR Single

```
fmvfgs FRT, RB
fmvfgs. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
FRT <- DOUBLE((RB)[32:63]) # DOUBLE since that's what lfs uses
```

Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.
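
The reverse direction can be modelled the same way. In this hedged
Python sketch the functions again only borrow the mnemonics as names;
`(RB)[32:63]` in Power ISA bit numbering is the least-significant 32
bits of the register, which is what the mask in `fmvfgs` selects.

```python
import struct


def fmvfg(rb: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", rb & 0xFFFF_FFFF_FFFF_FFFF))[0]


def fmvfgs(rb: int) -> float:
    """Reinterpret the low 32 bits of rb as an IEEE 754 single.

    The value is widened to a Python float (a double), mirroring DOUBLE().
    """
    return struct.unpack(">f", struct.pack(">I", rb & 0xFFFF_FFFF))[0]


print(fmvfg(0x4000_0000_0000_0000))   # the double 2.0
print(fmvfgs(0xDEAD_BEEF_3FC0_0000))  # upper 32 bits are ignored
```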

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)

----------

\newpage{}

## Double-Precision Floating Convert From Integer In GPR

```
fcvtfg FRT, RB, IT
fcvtfg. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
if IT[0] = 0 then  # 32-bit int -> 64-bit float
    # rounding never necessary, so don't touch FPSCR
    # based off xvcvsxwdp
    if IT = 0 then  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    else  # IT = 1 -- Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    FRT <- bfp64_CONVERT_FROM_BFP(src)
else
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0):  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1):  # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2):  # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default:  # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
    result <- bfp64_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP64(result)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.
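
The claim that 32-bit sources never round while 64-bit sources may can
be checked with a short Python model. This is a sketch under stated
assumptions, not the instruction's definition: the function only borrows
the name `fcvtfg`, it models the round-to-nearest-even mode only
(which is what Python's `float(int)` performs), and the second element
of the returned tuple is a hypothetical stand-in for the inexact (`XX`)
condition.

```python
def fcvtfg(rb: int, it: int) -> tuple[float, bool]:
    """Model of fcvtfg for round-to-nearest-even: (result, inexact)."""
    width = 64 if it & 2 else 32
    value = rb & ((1 << width) - 1)          # low 32 bits or all 64 bits
    signed = (it & 1) == 0
    if signed and value >> (width - 1):      # sign bit set: two's complement
        value -= 1 << width
    result = float(value)                    # correctly rounded, ties to even
    return result, int(result) != value


print(fcvtfg(0xFFFF_FFFF, 0))   # signed 32-bit: exact
print(fcvtfg(2**53 + 1, 2))     # signed 64-bit: needs rounding
```

Any 32-bit integer fits within the 53-bit significand of a double, which
is why the `IT[0] = 0` path never needs to touch `FPSCR`.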

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)
    FPSCR (TODO: which bits?) (if IT[0]=1)

### Assembly Aliases

| Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
|----------------------|----------------------|------|----------------------|----------------------|
| `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
| `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
| `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
| `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |

----------

\newpage{}

## Floating Convert From Integer In GPR Single

```
fcvtfgs FRT, RB, IT
fcvtfgs. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
# rounding may be necessary. based off xscvuxdsp
reset_xflags()
switch(IT)
    case(0):  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    case(1):  # Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    case(2):  # Signed 64-bit
        src <- bfp_CONVERT_FROM_SI64((RB))
    default:  # Unsigned 64-bit
        src <- bfp_CONVERT_FROM_UI64((RB))
rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
result32 <- bfp32_CONVERT_FROM_BFP(rnd)
cls <- fprf_CLASS_BFP32(result32)
result <- DOUBLE(result32)

if xx_flag = 1 then SetFX(FPSCR.XX)

FRT <- result
FPSCR.FPRF <- cls
FPSCR.FR <- inc_flag
FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.
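
Unlike the double-precision case, even a 32-bit integer may need
rounding here, because a single has only a 24-bit significand. The
following Python sketch rounds an integer directly to the nearest single
(ties to even). It is an illustration only: `round_int_to_single` is a
hypothetical helper, only the round-to-nearest-even mode is modelled,
and the input is assumed to lie in the 64-bit integer range so the
result never overflows single precision. Rounding in one step avoids the
double rounding that going via a double first could introduce.

```python
def round_int_to_single(value: int) -> float:
    """Round an integer to the nearest single-precision value, ties to even."""
    magnitude = abs(value)
    excess = magnitude.bit_length() - 24     # bits beyond the significand
    if excess > 0:
        half = 1 << (excess - 1)
        discarded = magnitude & ((1 << excess) - 1)
        magnitude >>= excess
        # Round up when past the halfway point, or exactly halfway and odd.
        if discarded > half or (discarded == half and magnitude & 1):
            magnitude += 1
        magnitude <<= excess
    # At most 25 significant bits remain, so float() is exact here.
    return float(-magnitude if value < 0 else magnitude)


print(round_int_to_single(16777217))   # 2^24 + 1 is a tie: rounds to even
print(round_int_to_single(16777219))   # 2^24 + 3 is a tie: rounds up
```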

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)
    FPSCR (TODO: which bits?)

### Assembly Aliases

| Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
|----------------------|-----------------------|------|----------------------|-----------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |

----------

\newpage{}

## Floating-point to Integer Conversion Overview

<div id="fpr-to-gpr-conversion-mode"></div>

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.

For convenience, we will give those different conversion semantics names
based on which common ISA or programming language uses them, since there
may not be an established name for them:

**Standard OpenPower conversion**

This conversion performs "saturation with NaN converted to minimum
valid integer". This is also exactly the same as the x86 ISA conversion
semantics. OpenPOWER however has instructions for both:

* rounding mode read from FPSCR
* rounding mode always set to truncate

**Java/Saturating conversion**

For the sake of simplicity, the FP -> Integer conversion semantics
generalized from those used by Java's semantics (and Rust's `as`
operator) will be referred to as [Java/Saturating conversion
semantics](#fp-to-int-java-saturating-conversion-semantics).

Those same semantics are used in some way by all of the following
languages (not necessarily for the default conversion method):

* Java's
  [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
  (only for long/int results)
* Rust's FP -> Integer conversion using the
  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
* LLVM's
  [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
  [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
* SPIR-V's OpenCL dialect's
  [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
  [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
  instructions when decorated with
  [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)

**JavaScript conversion**

For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).

This conversion is present in ARM assembler as the `FJCVTZS` instruction:
<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>

**Rc=1 and OE=1**

All of these instructions have an Rc=1 mode which sets CR0
in the normal way for any instructions producing a GPR result.
Additionally, when OE=1, if the numerical value of the FP number
is not 100% accurately preserved (due to truncation or saturation
and including when the FP number was NaN) then this is considered
to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
are all set as normal for any GPR instructions that overflow.
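
The overflow condition described above amounts to checking whether the
integer result is numerically equal to the floating-point source. A
minimal Python sketch, assuming a hypothetical helper named
`conversion_overflow` (Python compares a `float` with an `int` exactly,
which is what makes the one-line test valid):

```python
import math


def conversion_overflow(src: float, result: int) -> bool:
    """True when the integer result does not preserve the FP value exactly."""
    # NaN never compares equal, but the check is spelled out for clarity.
    return math.isnan(src) or src != result


print(conversion_overflow(3.0, 3))    # value preserved
print(conversion_overflow(3.5, 3))    # fraction lost by rounding
```

Whether this condition is reflected in `XER` and `CR0.SO` still depends
on OE=1, as stated above.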

\newpage{}

### FP to Integer Conversion Simplified Pseudo-code

Key for pseudo-code:

| term                      | result type | definition                                                                                         |
|---------------------------|-------------|----------------------------------------------------------------------------------------------------|
| `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                       |
| `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                              |
| `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                                |
| `int::BITS`               | `int`       | the bit-width of `int`                                                                             |
| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                            |
| `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                              |
| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store : `-2^(int::BITS-1)`                                             |
| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store : `2^(int::BITS-1) - 1`                                          |
| `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.           |
| `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode`      |

<div id="fp-to-int-openpower-conversion-semantics"></div>
OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
Power ISA v3.1B):

```
def fp_to_int_open_power<fp, int>(v: fp) -> int:
    if v is NaN:
        return int::MIN_VALUE
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-java-saturating-conversion-semantics"></div>
[Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
(only for long/int results)/
[Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
(with adjustment to add non-truncate rounding modes):

```
def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
    if v is NaN:
        return 0
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-javascript-conversion-semantics"></div>
Section 7.1 of the ECMAScript / JavaScript
[conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
(with adjustment to add non-truncate rounding modes):

```
def fp_to_int_java_script<fp, int>(v: fp) -> int:
    if v is NaN or infinite:
        return 0
    v = rint(v, rounding_mode)  # assume no loss of precision in result
    v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
    bits = (uint)v
    return (int)bits
```
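
The three simplified definitions above can be tried out with the
following runnable Python approximation. It is a sketch under stated
assumptions rather than a reference model: the generic `int` type is
replaced by hypothetical `bits` and `signed` parameters, the rounding
mode names (`"trunc"`, `"floor"`, `"ceil"`, `"nearest_even"`) are
illustrative, and no `FPSCR` or overflow flags are modelled.

```python
import math


def rint(v: float, mode: str) -> int:
    """Round a finite float to an integer using an illustrative mode name."""
    rounders = {"trunc": math.trunc, "floor": math.floor,
                "ceil": math.ceil, "nearest_even": round}
    return rounders[mode](v)


def int_range(bits: int, signed: bool) -> tuple[int, int]:
    """(MIN_VALUE, MAX_VALUE) of the target integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def _saturate(v: float, bits: int, signed: bool, mode: str, nan_result: int) -> int:
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return nan_result
    if v >= hi:            # also catches +infinity
        return hi
    if v <= lo:            # also catches -infinity
        return lo
    return rint(v, mode)


def fp_to_int_open_power(v, bits, signed, mode="trunc"):
    """Saturate; NaN becomes the minimum value of the integer type."""
    return _saturate(v, bits, signed, mode, int_range(bits, signed)[0])


def fp_to_int_java_saturating(v, bits, signed, mode="trunc"):
    """Saturate; NaN becomes zero."""
    return _saturate(v, bits, signed, mode, 0)


def fp_to_int_java_script(v, bits, signed, mode="trunc"):
    """Wrap modulo 2^bits; NaN and infinities become zero."""
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = rint(v, mode) % (1 << bits)    # Python's % is non-negative here
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits                 # reinterpret as two's complement
    return wrapped


for f in (fp_to_int_open_power, fp_to_int_java_saturating, fp_to_int_java_script):
    print(f.__name__, f(float("nan"), 32, True), f(4294967301.0, 32, True))
```

The only difference between the OpenPower and Java/Saturating variants
is the NaN result, whereas the JavaScript variant wraps out-of-range
values instead of saturating them.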


----------

\newpage{}


## Double-Precision Floating Convert To Integer In GPR

```
fcvttg RT, FRB, CVM, IT
fcvttg. RT, FRB, CVM, IT
fcvttgo RT, FRB, CVM, IT
fcvttgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP64((FRB))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```

Convert from 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result. Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

    CR0     (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR (TODO: which bits?)

### Assembly Aliases

| Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
|---------------------------|----------------------------|---------------------------|----------------------------|
| `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
| `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
| `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
| `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
| `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
| `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
| `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
| `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
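
To make the XO-Form field layout of `fcvttg` concrete, the following
Python sketch packs the fields into a 32-bit instruction word. It is
illustrative only: `encode_fcvttg` is a hypothetical helper, and because
this RFC does not allocate `PO` or `XO` values, both are taken as
parameters rather than constants. Power ISA numbers bits from the
most-significant end, so the fields are shifted in from left to right.

```python
def encode_fcvttg(po: int, rt: int, it: int, cvm: int, frb: int,
                  xo: int, oe: int, rc: int) -> int:
    """Pack the XO-Form fields of fcvttg into a 32-bit instruction word."""
    # (name, value, width in bits), listed from bit 0 (MSB) to bit 31 (LSB).
    fields = [("PO", po, 6), ("RT", rt, 5), ("IT", it, 2), ("CVM", cvm, 3),
              ("FRB", frb, 5), ("XO", xo, 9), ("OE", oe, 1), ("Rc", rc, 1)]
    word = 0
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


# PO = 0 and XO = 0 are placeholders, not allocated opcodes.
print(hex(encode_fcvttg(po=0, rt=3, it=2, cvm=5, frb=4, xo=0, oe=1, rc=0)))
```

With this layout an assembly alias such as `fcvttgdo RT, FRB, CVM`
corresponds to `it=2` and `oe=1`.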

----------

\newpage{}

## Floating Convert Single To Integer In GPR

```
fcvtstg RT, FRB, CVM, IT
fcvtstg. RT, FRB, CVM, IT
fcvtstgo RT, FRB, CVM, IT
fcvtstgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```

Convert from 32-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value, following the usual 32-bit float in 64-bit float
format. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result. Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

    CR0     (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR (TODO: which bits?)

### Assembly Aliases

| Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
|----------------------------|-----------------------------|----------------------------|-----------------------------|
| `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
| `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
| `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
| `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
| `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
| `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
| `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
| `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |

----------

\newpage{}

----------

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic | Description |
|----|------|------|---------|----------|-------------|
|VA  | I    | #    | 3.2B    |todo      |             |

----------------

[[!tag opf_rfc]]