# RFC ls006 FPR <-> GPR Move/Conversion

**URLs**:

* <https://libre-soc.org/openpower/sv/int_fp_mv/>
* <https://libre-soc.org/openpower/sv/rfc/ls006/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Instructions added

* `fmvtg` -- Floating Move To GPR
* `fmvfg` -- Floating Move From GPR
* `fcvttg` -- Floating Convert To Integer In GPR
* `fcvtstg` -- Floating Convert Single To Integer In GPR
* `fcvtfg` -- Floating Convert From Integer In GPR

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of five new GPR-FPR-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
GPR, FPR, Move, Conversion, JavaScript
```

**Motivation**

CPUs without VSX/VMX have no efficient way to transfer data between
FPRs and GPRs: the data has to go through memory. This proposal adds
instructions that transfer data directly between FPRs and GPRs, covering
both bitwise copy and Integer <-> FP conversion, without needing to go
through memory.

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. The variants supported
by this proposal, together with the languages and hardware that implement
each of them, are listed in the Floating-point to Integer Conversion
Overview.

**Notes and Observations**:

* These instructions are present in many other ISAs.
* JavaScript rounding as one instruction saves 35 instructions including
  six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)

**Changes**

Add the following entries to:

* Book I 4.6.5 Floating-Point Move Instructions
* Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# Immediate Tables

Tables that are used by
`fmvtg[s][.]`/`fmvfg[s][.]`/`fcvttg[s][.]`/`fcvtfg[s][.]`:

## `RCS` -- `Rc` and `s`

| `RCS` | `Rc` | FP Single Mode | Assembly Alias Mnemonic |
|-------|------|----------------|-------------------------|
| 0     | 0    | Double         | `<op>`                  |
| 1     | 1    | Double         | `<op>.`                 |
| 2     | 0    | Single         | `<op>s`                 |
| 3     | 1    | Single         | `<op>s.`                |

## `IT` -- Integer Type

| `IT` | Integer Type    | Assembly Alias Mnemonic |
|------|-----------------|-------------------------|
| 0    | Signed 32-bit   | `<op>w`                 |
| 1    | Unsigned 32-bit | `<op>uw`                |
| 2    | Signed 64-bit   | `<op>d`                 |
| 3    | Unsigned 64-bit | `<op>ud`                |

## `CVM` -- Float to Integer Conversion Mode

| `CVM` | `rounding_mode` | Semantics                        |
|-------|-----------------|----------------------------------|
| 000   | from `FPSCR`    | [OpenPower semantics]            |
| 001   | Truncate        | [OpenPower semantics]            |
| 010   | from `FPSCR`    | [Java/Saturating semantics]      |
| 011   | Truncate        | [Java/Saturating semantics]      |
| 100   | from `FPSCR`    | [JavaScript semantics]           |
| 101   | Truncate        | [JavaScript semantics]           |
| rest  | --              | illegal instruction trap for now |

[OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
[Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
[JavaScript semantics]: #fp-to-int-javascript-conversion-semantics

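As a minimal illustrative sketch (not part of the RFC text), the `IT` and `CVM` tables above can be read as plain lookups. The names `IT_TABLE`, `CVM_SEMANTICS` and `decode_cvm` are hypothetical helpers chosen for this example; only the table contents come from this section. The sketch assumes that the low-order bit of `CVM` selects truncation and the upper two bits select the semantics, which is how the table rows are laid out.

```python
# Illustrative lookups for the IT and CVM immediate tables.
# All names here are hypothetical helpers, not part of the ISA.

IT_TABLE = {
    0: ("Signed 32-bit", "w"),
    1: ("Unsigned 32-bit", "uw"),
    2: ("Signed 64-bit", "d"),
    3: ("Unsigned 64-bit", "ud"),
}

CVM_SEMANTICS = {0: "OpenPower", 1: "Java/Saturating", 2: "JavaScript"}


def decode_cvm(cvm):
    """Return (rounding_mode, semantics) for a 3-bit CVM value."""
    if not 0 <= cvm <= 7:
        raise ValueError("CVM is a 3-bit field")
    semantics = CVM_SEMANTICS.get(cvm >> 1)
    if semantics is None:
        # CVM = 6 and 7: "illegal instruction trap for now"
        raise ValueError("reserved CVM value")
    rounding = "Truncate" if cvm & 1 else "from FPSCR"
    return rounding, semantics


if __name__ == "__main__":
    for cvm in range(6):
        print(format(cvm, "03b"), decode_cvm(cvm))
```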
----------

\newpage{}

## Floating Move To GPR

```
fmvtg RT, FRB
fmvtg. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
RT <- (FRB)
```

Move a 64-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

    CR0     (if Rc=1)

----------

\newpage{}

## Floating Move To GPR Single

```
fmvtgs RT, FRB
fmvtgs. RT, FRB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | RT   | 0     | FRB   | XO    | Rc | X-Form |

```
RT <- [0] * 32 || SINGLE((FRB)) # SINGLE since that's what stfs uses
```

Move a 32-bit float from an FPR to a GPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
operations.

Special Registers altered:

    CR0     (if Rc=1)

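The difference between `fmvtg` and `fmvtgs` can be sketched in Python, modelling registers as 64-bit integers. This is a rough model only: the function names are hypothetical, and `struct` is used as a stand-in for `SINGLE()`, which is only equivalent when the value held in the FPR is exactly representable in single precision (otherwise `struct` rounds or raises, and NaN payloads may not be preserved).

```python
import struct

MASK64 = (1 << 64) - 1


def fmvtg(frb_bits):
    """Model of fmvtg: all 64 bits of the FPR are copied unchanged."""
    return frb_bits & MASK64


def fmvtgs(frb_bits):
    """Model of fmvtgs: single-precision bits, zero-extended to 64 bits.

    Assumption: the FPR holds a value exactly representable as a
    32-bit float, so converting via struct stands in for SINGLE().
    """
    value = struct.unpack(">d", struct.pack(">Q", frb_bits & MASK64))[0]
    single_bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return single_bits  # upper 32 bits are zero


if __name__ == "__main__":
    one_and_a_half = 0x3FF8_0000_0000_0000  # 1.5 in double format
    print(hex(fmvtg(one_and_a_half)))
    print(hex(fmvtgs(one_and_a_half)))
```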
----------

\newpage{}

## Floating Move From GPR

```
fmvfg FRT, RB
fmvfg. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
FRT <- (RB)
```

Move a 64-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)

----------

\newpage{}

## Floating Move From GPR Single

```
fmvfgs FRT, RB
fmvfgs. FRT, RB
```

| 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|----|--------|
| PO  | FRT  | 0     | RB    | XO    | Rc | X-Form |

```
FRT <- DOUBLE((RB)[32:63]) # DOUBLE since that's what lfs uses
```

Move a 32-bit float from a GPR to an FPR, just copying bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` is just copying bits, `FPSCR` is not affected in any way.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)

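A corresponding sketch for the GPR-to-FPR direction, again modelling registers as 64-bit integers. The function names are hypothetical and `struct` stands in for `DOUBLE()`; widening single to double is exact for ordinary values, but the handling of signalling NaN payloads by the host platform is an assumption this sketch does not verify.

```python
import struct

MASK64 = (1 << 64) - 1


def fmvfg(rb_bits):
    """Model of fmvfg: all 64 bits of the GPR are copied unchanged."""
    return rb_bits & MASK64


def fmvfgs(rb_bits):
    """Model of fmvfgs: widen the single in (RB)[32:63] to double format."""
    single_bits = rb_bits & 0xFFFF_FFFF  # low-order word of RB
    value = struct.unpack(">f", struct.pack(">I", single_bits))[0]
    return struct.unpack(">Q", struct.pack(">d", value))[0]


if __name__ == "__main__":
    # The upper 32 bits of RB are ignored by the single form.
    print(hex(fmvfgs(0xDEAD_BEEF_3FC0_0000)))
```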
----------

\newpage{}

## Floating Convert From Integer In GPR

```
fcvtfg FRT, RB, IT
fcvtfg. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
if IT[0] = 0 then  # 32-bit int -> 64-bit float
    # rounding never necessary, so don't touch FPSCR
    # based off xvcvsxwdp
    if IT = 0 then  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    else  # IT = 1 -- Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    FRT <- bfp64_CONVERT_FROM_BFP(src)
else
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0):  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1):  # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2):  # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default:  # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
    result <- bfp64_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP64(result)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)
    FPSCR (TODO: which bits?) (if IT[0]=1)

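The claim that 32-bit sources never need rounding while 64-bit sources may can be checked with a small Python sketch. The helper name `fcvtfg_value` is hypothetical, and the sketch assumes round-to-nearest-even (that is, `FPSCR.RN = 0b00`), which is what CPython's `float(int)` conversion performs; other rounding modes are not modelled.

```python
def fcvtfg_value(rb, it):
    """Return (float_result, inexact) for the integer in `rb`.

    `it` follows the IT table: bit value 2 selects 64-bit,
    bit value 1 selects unsigned. Rounding is nearest-even only.
    """
    bits = 64 if it & 2 else 32
    raw = rb & ((1 << bits) - 1)
    signed = (it & 1) == 0
    if signed and raw >> (bits - 1):
        raw -= 1 << bits            # reinterpret as two's complement
    result = float(raw)             # correctly rounded by CPython
    inexact = int(result) != raw    # would correspond to xx_flag
    return result, inexact


if __name__ == "__main__":
    print(fcvtfg_value(0xFFFF_FFFF, 0))   # signed 32-bit
    print(fcvtfg_value(0xFFFF_FFFF, 1))   # unsigned 32-bit
    print(fcvtfg_value(2**53 + 1, 2))     # signed 64-bit, needs rounding
```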
### Assembly Aliases

| Assembly Alias       | Full Instruction     |&nbsp;| Assembly Alias       | Full Instruction     |
|----------------------|----------------------|------|----------------------|----------------------|
| `fcvtfgw FRT, RB`    | `fcvtfg FRT, RB, 0`  |&nbsp;| `fcvtfgd FRT, RB`    | `fcvtfg FRT, RB, 2`  |
| `fcvtfgw. FRT, RB`   | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB`   | `fcvtfg. FRT, RB, 2` |
| `fcvtfguw FRT, RB`   | `fcvtfg FRT, RB, 1`  |&nbsp;| `fcvtfgud FRT, RB`   | `fcvtfg FRT, RB, 3`  |
| `fcvtfguw. FRT, RB`  | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB`  | `fcvtfg. FRT, RB, 3` |

----------

\newpage{}

## Floating Convert From Integer In GPR Single

```
fcvtfgs FRT, RB, IT
fcvtfgs. FRT, RB, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form   |
|-----|------|-------|-------|-------|-------|----|--------|
| PO  | FRT  | IT    | 0     | RB    | XO    | Rc | X-Form |

```
# rounding may be necessary. based off xscvuxdsp
reset_xflags()
switch(IT)
    case(0):  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    case(1):  # Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    case(2):  # Signed 64-bit
        src <- bfp_CONVERT_FROM_SI64((RB))
    default:  # Unsigned 64-bit
        src <- bfp_CONVERT_FROM_UI64((RB))
rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
result32 <- bfp32_CONVERT_FROM_BFP(rnd)
cls <- fprf_CLASS_BFP32(result32)
result <- DOUBLE(result32)

if xx_flag = 1 then SetFX(FPSCR.XX)

FRT <- result
FPSCR.FPRF <- cls
FPSCR.FR <- inc_flag
FPSCR.FI <- xx_flag
```
<!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
don't remove them -->

Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.

Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
operations.

Special Registers altered:

    CR1     (if Rc=1)
    FPSCR (TODO: which bits?)

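Unlike the double-precision form, converting to single precision can round even a 32-bit source, because a 32-bit float has only 24 significand bits. The sketch below illustrates this under stated assumptions: the helper names are hypothetical, rounding is nearest-even only (`FPSCR.RN = 0b00`), and inputs are restricted to magnitudes below 2^53 so that the intermediate Python `float` is exact and no double rounding occurs.

```python
import struct


def round_to_single(value):
    """Round a Python float to the nearest 32-bit float (ties to even)."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


def fcvtfgs_value(raw):
    """Return (result, inexact) for a signed Python int `raw`.

    Assumption: |raw| < 2**53, so float(raw) is exact and the only
    rounding step is the one to single precision.
    """
    if abs(raw) >= 1 << 53:
        raise ValueError("sketch only handles |raw| < 2**53")
    exact = float(raw)
    result = round_to_single(exact)
    return result, result != exact


if __name__ == "__main__":
    print(fcvtfgs_value(16777216))  # 2**24, exactly representable
    print(fcvtfgs_value(16777217))  # 2**24 + 1, must be rounded
```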
### Assembly Aliases

| Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
|----------------------|-----------------------|------|----------------------|-----------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |

----------

\newpage{}

## Floating-point to Integer Conversion Overview

<div id="fpr-to-gpr-conversion-mode"></div>

IEEE 754 doesn't specify what results are obtained when converting a NaN
or out-of-range floating-point value to integer, so different programming
languages and ISAs have made different choices. Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.

For convenience, we will give those different conversion semantics names
based on which common ISA or programming language uses them, since there
may not be an established name for them:

**Standard OpenPower conversion**

This conversion performs "saturation with NaN converted to minimum
valid integer". This is also exactly the same as the x86 ISA conversion
semantics. OpenPOWER however has instructions for both:

* rounding mode read from FPSCR
* rounding mode always set to truncate

**Java/Saturating conversion**

For the sake of simplicity, the FP -> Integer conversion semantics
generalized from those used by Java's semantics (and Rust's `as`
operator) will be referred to as [Java/Saturating conversion
semantics](#fp-to-int-java-saturating-conversion-semantics).

Those same semantics are used in some way by all of the following
languages (not necessarily for the default conversion method):

* Java's
  [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
  (only for long/int results)
* Rust's FP -> Integer conversion using the
  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
* LLVM's
  [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
  [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
* SPIR-V's OpenCL dialect's
  [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
  [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
  instructions when decorated with
  [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)

**JavaScript conversion**

For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).

This conversion is present in ARM assembler as the FJCVTZS instruction:
<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>

**Rc=1 and OE=1**

All of these instructions have an Rc=1 mode which sets CR0
in the normal way for any instructions producing a GPR result.
Additionally, when OE=1, if the numerical value of the FP number
is not 100% accurately preserved (due to truncation or saturation
and including when the FP number was NaN) then this is considered
to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
are all set as normal for any GPR instructions that overflow.

\newpage{}

### FP to Integer Conversion Simplified Pseudo-code

Key for pseudo-code:

| term                      | result type | definition                                                                                    |
|---------------------------|-------------|-----------------------------------------------------------------------------------------------|
| `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                  |
| `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                         |
| `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                           |
| `int::BITS`               | `int`       | the bit-width of `int`                                                                        |
| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                       |
| `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                         |
| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                         |
| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                      |
| `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.      |
| `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode` |

<div id="fp-to-int-openpower-conversion-semantics"></div>
OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
Power ISA v3.1B):

```
def fp_to_int_open_power<fp, int>(v: fp) -> int:
    if v is NaN:
        return int::MIN_VALUE
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```

<div id="fp-to-int-java-saturating-conversion-semantics"></div>
[Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
(only for long/int results)/
[Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
(with adjustment to add non-truncate rounding modes):

```
def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
    if v is NaN:
        return 0
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```

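The two saturating variants above differ only in the result for NaN. A runnable Python rendering makes this easy to check. It is a sketch under stated assumptions: the function names and the `bits`/`signed` parameters are hypothetical stand-ins for the generic `int` type, and `rounding_mode` is fixed to truncate.

```python
import math


def int_range(bits, signed):
    """Return (MIN_VALUE, MAX_VALUE) for the chosen integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def _saturate(v, lo, hi):
    # Python compares float and int exactly, so no precision is lost here.
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return math.trunc(v)  # rounding_mode assumed to be truncate


def fp_to_int_open_power(v, bits=32, signed=True):
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return lo  # NaN becomes the minimum valid integer
    return _saturate(v, lo, hi)


def fp_to_int_java_saturating(v, bits=32, signed=True):
    lo, hi = int_range(bits, signed)
    if math.isnan(v):
        return 0  # NaN becomes zero
    return _saturate(v, lo, hi)


if __name__ == "__main__":
    for v in (float("nan"), 1e10, -1e10, -3.7, float("inf")):
        print(v, fp_to_int_open_power(v), fp_to_int_java_saturating(v))
```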
<div id="fp-to-int-javascript-conversion-semantics"></div>
Section 7.1 of the ECMAScript / JavaScript
[conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
(with adjustment to add non-truncate rounding modes):

```
def fp_to_int_java_script<fp, int>(v: fp) -> int:
    if v is NaN or infinite:
        return 0
    v = rint(v, rounding_mode)  # assume no loss of precision in result
    v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
    bits = (uint)v
    return (int)bits
```

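The JavaScript variant wraps modulo `2^int::BITS` instead of saturating. The Python sketch below mirrors the pseudo-code above; the function name and the `bits`/`signed` parameters are hypothetical, and `rounding_mode` is assumed to be truncate, as in ECMAScript's `ToInt32`.

```python
import math


def fp_to_int_java_script(v, bits=32, signed=True):
    """Modular (wrapping) FP -> integer conversion, truncate rounding."""
    if math.isnan(v) or math.isinf(v):
        return 0
    n = math.trunc(v)      # exact Python int, no loss of precision
    n %= 1 << bits         # Python's % yields a non-negative result
    if signed and n >= 1 << (bits - 1):
        n -= 1 << bits     # reinterpret the bit pattern as signed
    return n


if __name__ == "__main__":
    for v in (2.0**32 + 5, 2147483648.0, -1.5, float("nan")):
        print(v, fp_to_int_java_script(v))
```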

----------

\newpage{}


## Floating Convert To Integer In GPR

```
fcvttg RT, FRB, CVM, IT
fcvttg. RT, FRB, CVM, IT
fcvttgo RT, FRB, CVM, IT
fcvttgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP64((FRB))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```

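The rounding-mode selection at the start of the pseudo-code (truncate when `CVM[2] = 1`, otherwise follow `FPSCR.RN`) can be sketched in Python for finite inputs. The function name is hypothetical; `CVM[2]` is taken to be the low-order bit of the 3-bit field (MSB0 numbering), and the sketch returns exact Python integers, so it does not model signed zero, NaN or infinity.

```python
import math


def round_to_integer(src, cvm, fpscr_rn):
    """Round finite `src` to an integer per CVM and FPSCR.RN."""
    if math.isnan(src) or math.isinf(src):
        raise ValueError("sketch only handles finite inputs")
    if (cvm & 1) or fpscr_rn == 0b01:
        return math.trunc(src)   # round toward zero
    if fpscr_rn == 0b00:
        return round(src)        # round to nearest, ties to even
    if fpscr_rn == 0b10:
        return math.ceil(src)    # round toward +infinity
    return math.floor(src)       # 0b11: round toward -infinity


if __name__ == "__main__":
    for rn in range(4):
        print(format(rn, "02b"), round_to_integer(-2.5, 0b000, rn))
```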
Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result. Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

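The overflow condition described above (the FP value is not exactly preserved by the integer result) can be expressed as a one-line check. The sketch below is illustrative only: the function name is hypothetical, the integer result is passed in as a Python `int`, and the effect on XER and CR0 is not modelled.

```python
import math


def conversion_overflow(src, result):
    """True when integer `result` does not exactly equal FP `src`.

    Mirrors `IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp)`.
    Python compares float and int exactly, so no rounding is involved.
    """
    return math.isnan(src) or src != result


if __name__ == "__main__":
    print(conversion_overflow(3.0, 3))            # value preserved
    print(conversion_overflow(3.5, 3))            # lost to truncation
    print(conversion_overflow(1e10, 2**31 - 1))   # lost to saturation
```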
Special Registers altered:

    CR0     (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR (TODO: which bits?)

### Assembly Aliases

| Assembly Alias            | Full Instruction           | Assembly Alias            | Full Instruction           |
|---------------------------|----------------------------|---------------------------|----------------------------|
| `fcvttgw RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 0`   | `fcvttgd RT, FRB, CVM`    | `fcvttg RT, FRB, CVM, 2`   |
| `fcvttgw. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 0`  | `fcvttgd. RT, FRB, CVM`   | `fcvttg. RT, FRB, CVM, 2`  |
| `fcvttgwo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 0`  | `fcvttgdo RT, FRB, CVM`   | `fcvttgo RT, FRB, CVM, 2`  |
| `fcvttgwo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM`  | `fcvttgo. RT, FRB, CVM, 2` |
| `fcvttguw RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 1`   | `fcvttgud RT, FRB, CVM`   | `fcvttg RT, FRB, CVM, 3`   |
| `fcvttguw. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 1`  | `fcvttgud. RT, FRB, CVM`  | `fcvttg. RT, FRB, CVM, 3`  |
| `fcvttguwo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 1`  | `fcvttgudo RT, FRB, CVM`  | `fcvttgo RT, FRB, CVM, 3`  |
| `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |

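The alias table follows a regular pattern: the base mnemonic, an integer-type suffix (`w`, `uw`, `d`, `ud`), an optional `o` for OE=1 and an optional `.` for Rc=1. A sketch of how an assembler might expand such aliases is shown below; `expand_alias` and its regular expression are hypothetical and are not taken from any existing toolchain.

```python
import re

# Integer-type suffix -> IT immediate, as in the IT table.
IT_BY_SUFFIX = {"w": 0, "uw": 1, "d": 2, "ud": 3}

# base mnemonic, IT suffix, optional "o" (OE=1), optional "." (Rc=1)
ALIAS_RE = re.compile(r"^(fcvttg|fcvtstg)(uw|ud|w|d)(o?)(\.?)$")


def expand_alias(mnemonic, rt, frb, cvm):
    """Expand an alias such as 'fcvttguwo.' into the full instruction."""
    match = ALIAS_RE.match(mnemonic)
    if match is None:
        raise ValueError(f"not a recognised alias: {mnemonic}")
    base, it_suffix, oe, rc = match.groups()
    return f"{base}{oe}{rc} {rt}, {frb}, {cvm}, {IT_BY_SUFFIX[it_suffix]}"


if __name__ == "__main__":
    print(expand_alias("fcvttguwo.", "RT", "FRB", "CVM"))
```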
----------

\newpage{}

## Floating Convert Single To Integer In GPR

```
fcvtstg RT, FRB, CVM, IT
fcvtstg. RT, FRB, CVM, IT
fcvtstgo RT, FRB, CVM, IT
fcvtstgo. RT, FRB, CVM, IT
```

| 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form    |
|-----|------|-------|-------|-------|-------|----|----|---------|
| PO  | RT   | IT    | CVM   | FRB   | XO    | OE | Rc | XO-Form |

```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)  # in range: convert rnd
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```

Convert from a 32-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value, following the usual 32-bit float in 64-bit float
format. `FPSCR` is modified and exceptions are raised as usual.

These instructions have an Rc=1 mode which sets CR0 in the normal
way for any instructions producing a GPR result. Additionally, when OE=1,
if the numerical value of the FP number is not 100% accurately preserved
(due to truncation or saturation and including when the FP number was
NaN) then this is considered to be an Integer Overflow condition, and
CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
that overflow.

Special Registers altered:

    CR0     (if Rc=1)
    XER SO, OV, OV32 (if OE=1)
    FPSCR (TODO: which bits?)

### Assembly Aliases

| Assembly Alias             | Full Instruction            | Assembly Alias             | Full Instruction            |
|----------------------------|-----------------------------|----------------------------|-----------------------------|
| `fcvtstgw RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 0`   | `fcvtstgd RT, FRB, CVM`    | `fcvtstg RT, FRB, CVM, 2`   |
| `fcvtstgw. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 0`  | `fcvtstgd. RT, FRB, CVM`   | `fcvtstg. RT, FRB, CVM, 2`  |
| `fcvtstgwo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 0`  | `fcvtstgdo RT, FRB, CVM`   | `fcvtstgo RT, FRB, CVM, 2`  |
| `fcvtstgwo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM`  | `fcvtstgo. RT, FRB, CVM, 2` |
| `fcvtstguw RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 1`   | `fcvtstgud RT, FRB, CVM`   | `fcvtstg RT, FRB, CVM, 3`   |
| `fcvtstguw. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 1`  | `fcvtstgud. RT, FRB, CVM`  | `fcvtstg. RT, FRB, CVM, 3`  |
| `fcvtstguwo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 1`  | `fcvtstgudo RT, FRB, CVM`  | `fcvtstgo RT, FRB, CVM, 3`  |
| `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |

----------

\newpage{}

----------

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic | Description |
|----|------|------|---------|----------|-------------|
|VA  | I    | #    | 3.2B    |todo      |             |

----------------

[[!tag opf_rfc]]