1 # RFC ls006 FPR <-> GPR Move/Conversion
2
3 **URLs**:
4
5 * <https://libre-soc.org/openpower/sv/int_fp_mv/>
6 * <https://libre-soc.org/openpower/sv/rfc/ls006/>
7 * <https://bugs.libre-soc.org/show_bug.cgi?id=1015>
8 * <https://git.openpower.foundation/isa/PowerISA/issues/todo>
9
10 **Severity**: Major
11
12 **Status**: New
13
14 **Date**: 20 Oct 2022
15
16 **Target**: v3.2B
17
18 **Source**: v3.1B
19
20 **Books and Section affected**: **UPDATE**
21
22 * Book I 4.6.5 Floating-Point Move Instructions
23 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
24 * Appendix E Power ISA sorted by opcode
25 * Appendix F Power ISA sorted by version
26 * Appendix G Power ISA sorted by Compliancy Subset
27 * Appendix H Power ISA sorted by mnemonic
28
29 **Summary**
30
31 Single-precision Instructions added:
32
* `fmvtgs` -- Single-Precision Floating Move To GPR
* `fmvfgs` -- Single-Precision Floating Move From GPR
* `fcvtstg` -- Single-Precision Floating Convert To Integer In GPR
* `fcvtfgs` -- Single-Precision Floating Convert From Integer In GPR
37
38 Identical (except Double-precision) Instructions added:
39
40 * `fmvtg` -- Double-Precision Floating Move To GPR
41 * `fmvfg` -- Double-Precision Floating Move From GPR
42 * `fcvttg` -- Double-Precision Floating Convert To Integer In GPR
43 * `fcvtfg` -- Double-Precision Floating Convert From Integer In GPR
44
45 **Submitter**: Luke Leighton (Libre-SOC)
46
47 **Requester**: Libre-SOC
48
49 **Impact on processor**:
50
51 * Addition of four new Single-Precision GPR-FPR-based instructions
52 * Addition of four new Double-Precision GPR-FPR-based instructions
53
54 **Impact on software**:
55
56 * Requires support for new instructions in assembler, debuggers,
57 and related tools.
58
59 **Keywords**:
60
61 ```
62 GPR, FPR, Move, Conversion, JavaScript
63 ```
64
65 **Motivation**
66
CPUs without VSX/VMX lack a way to efficiently transfer data between
FPRs and GPRs: the data has to go through memory. This proposal adds
more efficient data transfer instructions (both bitwise copy and
Integer <-> FP conversion) that transfer directly between FPRs and GPRs
without needing to go through memory.
72
73 IEEE 754 doesn't specify what results are obtained when converting a NaN
74 or out-of-range floating-point value to integer, so different programming
75 languages and ISAs have made different choices. Below is an overview
76 of the different variants, listing the languages and hardware that
77 implements each variant.
78
79 **Notes and Observations**:
80
81 * These instructions are present in many other ISAs.
82 * JavaScript rounding as one instruction saves 35 instructions including
83 six branches. (FIXME: disagrees with int_fp_mv and int_fp_mv/appendix)
84 * Both sets are orthogonal (no difference except being Single/Double).
85 This allows IBM to follow the pre-existing precedent of allocating
86 separate Major Opcodes (PO) for Double-precision and Single-precision
87 respectively.
88
89 **Changes**
90
91 Add the following entries to:
92
93 * Book I 4.6.5 Floating-Point Move Instructions
94 * Book I 4.6.7.2 Floating-Point Convert To/From Integer Instructions
95 * Book I 1.6.1 and 1.6.2
96
97 ----------------
98
99 \newpage{}
100
101 # Immediate Tables
102
103 Tables that are used by
104 `fmvtg[s][.]`/`fmvfg[s][.]`/`fcvt[s]tg[o][.]`/`fcvtfg[s][.]`:
105
106 ## `IT` -- Integer Type
107
108 | `IT` | Integer Type | Assembly Alias Mnemonic |
109 |------|-----------------|-------------------------|
110 | 0 | Signed 32-bit | `<op>w` |
111 | 1 | Unsigned 32-bit | `<op>uw` |
112 | 2 | Signed 64-bit | `<op>d` |
113 | 3 | Unsigned 64-bit | `<op>ud` |
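
As a reading aid, the `IT` encoding in the table above can be sketched in
Python. The names `IntType` and `decode_it` are illustrative assumptions and
not part of the proposal. Bit numbering follows the Power ISA MSB0 convention,
so `IT[0]` selects the width and `IT[1]` selects signedness.

```python
from typing import NamedTuple


class IntType(NamedTuple):
    signed: bool
    bits: int
    alias_suffix: str  # appended to <op> to form the assembly alias


def decode_it(it: int) -> IntType:
    """Decode the 2-bit IT field (hypothetical helper, not from the RFC)."""
    if not 0 <= it <= 3:
        raise ValueError("IT is a 2-bit field")
    wide = (it >> 1) & 1   # IT[0] in MSB0 numbering: 0 = 32-bit, 1 = 64-bit
    unsigned = it & 1      # IT[1] in MSB0 numbering: 1 = unsigned
    suffix = ("u" if unsigned else "") + ("d" if wide else "w")
    return IntType(signed=not unsigned, bits=64 if wide else 32,
                   alias_suffix=suffix)
```

For example, `decode_it(1)` describes an unsigned 32-bit integer with the
alias suffix `uw`.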
114
115 ## `CVM` -- Float to Integer Conversion Mode
116
117 | `CVM` | `rounding_mode` | Semantics |
118 |-------|-----------------|----------------------------------|
119 | 000 | from `FPSCR` | [OpenPower semantics] |
120 | 001 | Truncate | [OpenPower semantics] |
121 | 010 | from `FPSCR` | [Java/Saturating semantics] |
122 | 011 | Truncate | [Java/Saturating semantics] |
123 | 100 | from `FPSCR` | [JavaScript semantics] |
124 | 101 | Truncate | [JavaScript semantics] |
125 | rest | -- | illegal instruction trap for now |
126
127 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
128 [Java/Saturating semantics]: #fp-to-int-java-saturating-conversion-semantics
129 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
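
The `CVM` table can likewise be read as a small decoder. This is only a
sketch: `decode_cvm` is a hypothetical name, and raising `ValueError` merely
stands in for the illegal instruction trap on the reserved encodings.

```python
from typing import Tuple


def decode_cvm(cvm: int) -> Tuple[str, str]:
    """Return (rounding mode source, semantics) for a 3-bit CVM value."""
    if not 0 <= cvm <= 7:
        raise ValueError("CVM is a 3-bit field")
    if cvm >= 6:
        # reserved encodings: the table specifies an illegal instruction trap
        raise ValueError("reserved CVM encoding")
    semantics = ("OpenPower", "Java/Saturating", "JavaScript")[cvm >> 1]
    rounding = "Truncate" if cvm & 1 else "FPSCR"  # CVM[2] in MSB0 numbering
    return rounding, semantics
```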
130
131 ----------
132
133 ## Floating Move To GPR
134
135 ```
136 fmvtg RT, FRB
137 fmvtg. RT, FRB
138 ```
139
140 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
141 |-----|------|-------|-------|-------|----|--------|
142 | PO | RT | 0 | FRB | XO | Rc | X-Form |
143
144 ```
145 RT <- (FRB)
146 ```
147
Move a 64-bit float from an FPR to a GPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `stfd` followed by `ld`.
As `fmvtg` only copies bits, `FPSCR` is not affected in any way.
151
152 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
153 operations.
154
155 Special Registers altered:
156
157 ```
158 CR0 (if Rc=1)
159 ```
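
The bit-for-bit nature of `fmvtg` can be illustrated with Python's `struct`
module. This is a minimal sketch, with `fmvtg_model` as a hypothetical name; it
models only the value written to RT and ignores Rc.

```python
import struct


def fmvtg_model(frb: float) -> int:
    """Return the 64-bit IEEE 754 bit pattern of a double as an unsigned int."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", frb))
    return bits
```

No arithmetic takes place, so `-0.0` and `0.0` yield different integers even
though they compare equal as floats.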
160
161 ----------
162
163 ## Floating Move To GPR Single
164
165 ```
166 fmvtgs RT, FRB
167 fmvtgs. RT, FRB
168 ```
169
170 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
171 |-----|------|-------|-------|-------|----|--------|
172 | PO | RT | 0 | FRB | XO | Rc | X-Form |
173
174 ```
175 RT <- [0] * 32 || SINGLE((FRB)) # SINGLE since that's what stfs uses
176 ```
177
Move a 32-bit float from an FPR to a GPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `stfs` followed by `lwz`.
As `fmvtgs` only copies bits, `FPSCR` is not affected in any way.
181
182 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
183 operations.
184
185 Special Registers altered:
186
187 ```
188 CR0 (if Rc=1)
189 ```
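
A comparable sketch for `fmvtgs` follows. It assumes the FPR already holds a
value representable in single precision (as after `lfs` or `frsp`); for such
values the `"f"` format of `struct` gives the same bits as `SINGLE()`. For
other inputs `struct` rounds, which the instruction does not, so the sketch
should not be relied on there. `fmvtgs_model` is a hypothetical name.

```python
import struct


def fmvtgs_model(frb: float) -> int:
    """Return the binary32 bit pattern, zero-extended to 64 bits."""
    # struct's "f" format narrows the double to binary32, standing in for SINGLE()
    (bits,) = struct.unpack("<I", struct.pack("<f", frb))
    return bits  # the upper 32 bits of the GPR are zero
```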
190
191 ----------
192
193 \newpage{}
194
195 ## Double-Precision Floating Move From GPR
196
197 ```
198 fmvfg FRT, RB
199 fmvfg. FRT, RB
200 ```
201
202 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
203 |-----|------|-------|-------|-------|----|--------|
204 | PO | FRT | 0 | RB | XO | Rc | X-Form |
205
206 ```
207 FRT <- (RB)
208 ```
209
Move a 64-bit float from a GPR to an FPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `std` followed by `lfd`.
As `fmvfg` only copies bits, `FPSCR` is not affected in any way.
213
214 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
215 operations.
216
217 Special Registers altered:
218
219 ```
220 CR1 (if Rc=1)
221 ```
222
223 ----------
224
225 ## Floating Move From GPR Single
226
227 ```
228 fmvfgs FRT, RB
229 fmvfgs. FRT, RB
230 ```
231
232 | 0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31 | Form |
233 |-----|------|-------|-------|-------|----|--------|
234 | PO | FRT | 0 | RB | XO | Rc | X-Form |
235
236 ```
237 FRT <- DOUBLE((RB)[32:63]) # DOUBLE since that's what lfs uses
238 ```
239
Move a 32-bit float from a GPR to an FPR, copying the bits of the IEEE 754
representation directly. This is equivalent to `stw` followed by `lfs`.
As `fmvfgs` only copies bits, `FPSCR` is not affected in any way.
243
244 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
245 operations.
246
247 Special Registers altered:
248
249 ```
250 CR1 (if Rc=1)
251 ```
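
The reverse direction can be sketched the same way. In this illustration
(`fmvfgs_model` is a hypothetical name) only the low 32 bits of the GPR are
used, matching `(RB)[32:63]`, and the Python `float` result is a binary64
value, mirroring `DOUBLE()`.

```python
import struct


def fmvfgs_model(rb: int) -> float:
    """Interpret the low 32 bits of a GPR value as a binary32 float."""
    low_word = rb & 0xFFFF_FFFF  # (RB)[32:63]
    (value,) = struct.unpack("<f", struct.pack("<I", low_word))
    return value  # Python floats are binary64
```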
252
253 ----------
254
255 \newpage{}
256
257 ## Double-Precision Floating Convert From Integer In GPR
258
259 ```
260 fcvtfg FRT, RB, IT
261 fcvtfg. FRT, RB, IT
262 ```
263
264 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form |
265 |-----|------|-------|-------|-------|-------|----|--------|
266 | PO | FRT | IT | 0 | RB | XO | Rc | X-Form |
267
```
if IT[0] = 0 then  # 32-bit int -> 64-bit float
    # rounding never necessary, so don't touch FPSCR
    # based off xvcvsxwdp
    if IT = 0 then  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    else  # IT = 1 -- Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    FRT <- bfp64_CONVERT_FROM_BFP(src)
else
    # rounding may be necessary. based off xscvuxdsp
    reset_xflags()
    switch(IT)
        case(0):  # Signed 32-bit
            src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
        case(1):  # Unsigned 32-bit
            src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
        case(2):  # Signed 64-bit
            src <- bfp_CONVERT_FROM_SI64((RB))
        default:  # Unsigned 64-bit
            src <- bfp_CONVERT_FROM_UI64((RB))
    rnd <- bfp_ROUND_TO_BFP64(FPSCR.RN, src)
    result <- bfp64_CONVERT_FROM_BFP(rnd)
    cls <- fprf_CLASS_BFP64(result)

    if xx_flag = 1 then SetFX(FPSCR.XX)

    FRT <- result
    FPSCR.FPRF <- cls
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
```
300 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
301 don't remove them -->
302
Convert from an unsigned/signed 32/64-bit integer in RB to a 64-bit
float in FRT.

If converting from an unsigned/signed 32-bit integer to a 64-bit float,
rounding is never necessary, so `FPSCR` is unmodified and exceptions are
never raised. Otherwise, `FPSCR` is modified and exceptions are raised
as usual.
310
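
The claim that 32-bit sources never need rounding, while 64-bit sources may,
can be checked with a small sketch. `int_to_double_is_exact` is a hypothetical
helper. Python converts `int` to `float` using round-to-nearest-even, so this
corresponds only to `FPSCR.RN = 0b00`.

```python
def int_to_double_is_exact(value: int) -> bool:
    """True if converting `value` to binary64 needs no rounding."""
    return int(float(value)) == value
```

Every 32-bit integer fits in the 53-bit significand of a binary64 value,
whereas `2**53 + 1` is the smallest positive integer that does not.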
311 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
312 operations.
313
314 Special Registers altered:
315
```
CR1 (if Rc=1)
FPSCR (TODO: which bits?) (if IT[0]=1)
```
320
321 ### Assembly Aliases
322
323 | Assembly Alias | Full Instruction |&nbsp;| Assembly Alias | Full Instruction |
324 |----------------------|----------------------|------|----------------------|----------------------|
325 | `fcvtfgw FRT, RB` | `fcvtfg FRT, RB, 0` |&nbsp;| `fcvtfgd FRT, RB` | `fcvtfg FRT, RB, 2` |
326 | `fcvtfgw. FRT, RB` | `fcvtfg. FRT, RB, 0` |&nbsp;| `fcvtfgd. FRT, RB` | `fcvtfg. FRT, RB, 2` |
327 | `fcvtfguw FRT, RB` | `fcvtfg FRT, RB, 1` |&nbsp;| `fcvtfgud FRT, RB` | `fcvtfg FRT, RB, 3` |
328 | `fcvtfguw. FRT, RB` | `fcvtfg. FRT, RB, 1` |&nbsp;| `fcvtfgud. FRT, RB` | `fcvtfg. FRT, RB, 3` |
329
330 ----------
331
332 \newpage{}
333
334 ## Floating Convert From Integer In GPR Single
335
336 ```
337 fcvtfgs FRT, RB, IT
338 fcvtfgs. FRT, RB, IT
339 ```
340
341 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-30 | 31 | Form |
342 |-----|------|-------|-------|-------|-------|----|--------|
343 | PO | FRT | IT | 0 | RB | XO | Rc | X-Form |
344
```
# rounding may be necessary. based off xscvuxdsp
reset_xflags()
switch(IT)
    case(0):  # Signed 32-bit
        src <- bfp_CONVERT_FROM_SI32((RB)[32:63])
    case(1):  # Unsigned 32-bit
        src <- bfp_CONVERT_FROM_UI32((RB)[32:63])
    case(2):  # Signed 64-bit
        src <- bfp_CONVERT_FROM_SI64((RB))
    default:  # Unsigned 64-bit
        src <- bfp_CONVERT_FROM_UI64((RB))
rnd <- bfp_ROUND_TO_BFP32(FPSCR.RN, src)
result32 <- bfp32_CONVERT_FROM_BFP(rnd)
cls <- fprf_CLASS_BFP32(result32)
result <- DOUBLE(result32)

if xx_flag = 1 then SetFX(FPSCR.XX)

FRT <- result
FPSCR.FPRF <- cls
FPSCR.FR <- inc_flag
FPSCR.FI <- xx_flag
```
369 <!-- note the PowerISA spec. explicitly has empty lines before/after SetFX,
370 don't remove them -->
371
Convert from an unsigned/signed 32/64-bit integer in RB to a 32-bit
float in FRT, following the usual 32-bit float in 64-bit float format.
`FPSCR` is modified and exceptions are raised as usual.
375
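
Unlike the double-precision case, even 32-bit sources can be inexact here,
because binary32 has only a 24-bit significand. The sketch below is limited to
32-bit inputs and round-to-nearest-even (`FPSCR.RN = 0b00`). The limit is
deliberate: going via binary64 could round twice for 64-bit inputs. `to_single`
and `is_inexact` are hypothetical names.

```python
import struct


def to_single(value: int) -> float:
    """Round a 32-bit integer to binary32, returned as a Python float."""
    if not -2**31 <= value < 2**32:
        raise ValueError("sketch only covers 32-bit inputs")
    # int -> binary64 is exact for 32-bit inputs, so only the binary32 step rounds
    (single,) = struct.unpack("<f", struct.pack("<f", float(value)))
    return single


def is_inexact(value: int) -> bool:
    """Corresponds to the condition under which xx_flag would be set."""
    return to_single(value) != value
```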
376 Rc=1 tests FRT and sets CR1, exactly like all other Scalar Floating-Point
377 operations.
378
379 Special Registers altered:
380
```
CR1 (if Rc=1)
FPSCR (TODO: which bits?)
```
385
386 ### Assembly Aliases
387
| Assembly Alias       | Full Instruction      |&nbsp;| Assembly Alias       | Full Instruction      |
|----------------------|-----------------------|------|----------------------|-----------------------|
| `fcvtfgws FRT, RB`   | `fcvtfgs FRT, RB, 0`  |&nbsp;| `fcvtfgds FRT, RB`   | `fcvtfgs FRT, RB, 2`  |
| `fcvtfgws. FRT, RB`  | `fcvtfgs. FRT, RB, 0` |&nbsp;| `fcvtfgds. FRT, RB`  | `fcvtfgs. FRT, RB, 2` |
| `fcvtfguws FRT, RB`  | `fcvtfgs FRT, RB, 1`  |&nbsp;| `fcvtfguds FRT, RB`  | `fcvtfgs FRT, RB, 3`  |
| `fcvtfguws. FRT, RB` | `fcvtfgs. FRT, RB, 1` |&nbsp;| `fcvtfguds. FRT, RB` | `fcvtfgs. FRT, RB, 3` |
394
395 ----------
396
397 \newpage{}
398
399 ## Floating-point to Integer Conversion Overview
400
401 <div id="fpr-to-gpr-conversion-mode"></div>
402
403 IEEE 754 doesn't specify what results are obtained when converting a NaN
404 or out-of-range floating-point value to integer, so different programming
405 languages and ISAs have made different choices. Below is an overview
406 of the different variants, listing the languages and hardware that
407 implements each variant.
408
409 For convenience, we will give those different conversion semantics names
410 based on which common ISA or programming language uses them, since there
411 may not be an established name for them:
412
413 **Standard OpenPower conversion**
414
415 This conversion performs "saturation with NaN converted to minimum
416 valid integer". This is also exactly the same as the x86 ISA conversion
417 semantics. OpenPOWER however has instructions for both:
418
419 * rounding mode read from FPSCR
420 * rounding mode always set to truncate
421
422 **Java/Saturating conversion**
423
424 For the sake of simplicity, the FP -> Integer conversion semantics
425 generalized from those used by Java's semantics (and Rust's `as`
426 operator) will be referred to as [Java/Saturating conversion
427 semantics](#fp-to-int-java-saturating-conversion-semantics).
428
429 Those same semantics are used in some way by all of the following
430 languages (not necessarily for the default conversion method):
431
432 * Java's
433 [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
434 (only for long/int results)
435 * Rust's FP -> Integer conversion using the
436 [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
437 * LLVM's
438 [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
439 [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
440 * SPIR-V's OpenCL dialect's
441 [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
442 [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
443 instructions when decorated with
444 [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
[trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
[trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
448
449 **JavaScript conversion**
450
For the sake of simplicity, the FP -> Integer conversion
semantics generalized from those used by JavaScript's `ToInt32`
abstract operation will be referred to as [JavaScript conversion
semantics](#fp-to-int-javascript-conversion-semantics).
455
456 This instruction is present in ARM assembler as FJCVTZS
457 <https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
458
459 **Rc=1 and OE=1**
460
461 All of these instructions have an Rc=1 mode which sets CR0
462 in the normal way for any instructions producing a GPR result.
463 Additionally, when OE=1, if the numerical value of the FP number
464 is not 100% accurately preserved (due to truncation or saturation
465 and including when the FP number was NaN) then this is considered
466 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
467 are all set as normal for any GPR instructions that overflow.
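
The overflow condition described above amounts to checking whether the integer
result still equals the FP source. A minimal sketch, with
`conversion_overflows` as a hypothetical name:

```python
import math


def conversion_overflows(src: float, result: int) -> bool:
    """True when the integer result does not exactly preserve the FP source."""
    # Python compares float and int by exact numerical value
    return math.isnan(src) or src != result
```

Under this reading, converting `3.5` to `3` signals overflow (when OE=1) but
converting `3.0` to `3` does not.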
468
469 \newpage{}
470
471 ### FP to Integer Conversion Simplified Pseudo-code
472
473 Key for pseudo-code:
474
475 | term | result type | definition |
476 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
477 | `fp` | -- | `f32` or `f64` (or other types from SimpleV) |
478 | `int` | -- | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV) |
479 | `uint` | -- | the unsigned integer of the same bit-width as `int` |
480 | `int::BITS` | `int` | the bit-width of `int` |
481 | `uint::MIN_VALUE` | `uint` | the minimum value `uint` can store: `0` |
482 | `uint::MAX_VALUE` | `uint` | the maximum value `uint` can store: `2^int::BITS - 1` |
483 | `int::MIN_VALUE` | `int` | the minimum value `int` can store : `-2^(int::BITS-1)` |
484 | `int::MAX_VALUE` | `int` | the maximum value `int` can store : `2^(int::BITS-1) - 1` |
485 | `int::VALUE_COUNT` | Integer | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`. |
486 | `rint(fp, rounding_mode)` | `fp` | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode` |
487
488 <div id="fp-to-int-openpower-conversion-semantics"></div>
489 OpenPower conversion semantics (section A.2 page 1009 (page 1035) of
490 Power ISA v3.1B):
491
```
def fp_to_int_open_power<fp, int>(v: fp) -> int:
    if v is NaN:
        return int::MIN_VALUE
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```
502
503 <div id="fp-to-int-java-saturating-conversion-semantics"></div>
504 [Java/Saturating conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
505 (only for long/int results)
506 (with adjustment to add non-truncate rounding modes):
507
```
def fp_to_int_java_saturating<fp, int>(v: fp) -> int:
    if v is NaN:
        return 0
    if v >= int::MAX_VALUE:
        return int::MAX_VALUE
    if v <= int::MIN_VALUE:
        return int::MIN_VALUE
    return (int)rint(v, rounding_mode)
```
518
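
The OpenPower and Java/Saturating variants differ only in the result for NaN.
The following runnable sketch makes that explicit. It fixes the rounding mode
to truncate, and the function name and the `nan_result` parameter are
illustrative assumptions rather than anything defined by the proposal.

```python
import math


def fp_to_int_saturating(v: float, bits: int, signed: bool,
                         nan_result: str) -> int:
    """Truncating FP -> integer conversion with saturation.

    nan_result: "min" for OpenPower semantics, "zero" for Java/Saturating.
    """
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return lo if nan_result == "min" else 0
    if v >= hi:          # also covers +infinity
        return hi
    if v <= lo:          # also covers -infinity
        return lo
    return math.trunc(v)
```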
519 <div id="fp-to-int-javascript-conversion-semantics"></div>
520 Section 7.1 of the ECMAScript / JavaScript
521 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32)
522 (with adjustment to add non-truncate rounding modes):
523
```
def fp_to_int_java_script<fp, int>(v: fp) -> int:
    if v is NaN or infinite:
        return 0
    v = rint(v, rounding_mode)  # assume no loss of precision in result
    v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
    bits = (uint)v
    return (int)bits
```
533
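
A runnable counterpart of the JavaScript variant, again fixed to truncation,
might look like the sketch below. `fp_to_int_javascript` and its parameters
are hypothetical. `math.trunc` returns an exact Python integer, so the "no
loss of precision" assumption in the pseudo-code holds by construction.

```python
import math


def fp_to_int_javascript(v: float, bits: int = 32, signed: bool = True) -> int:
    """Wrap-around FP -> integer conversion in the style of ToInt32."""
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = math.trunc(v) % (1 << bits)  # Python's % is non-negative here
    if signed and wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits               # reinterpret as two's complement
    return wrapped
```

Out-of-range inputs wrap rather than saturate: `2147483648.0` becomes
`-2147483648` for a signed 32-bit result.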
534
535 ----------
536
537 \newpage{}
538
539
540 ## Double-Precision Floating Convert To Integer In GPR
541
542 ```
543 fcvttg RT, FRB, CVM, IT
544 fcvttg. RT, FRB, CVM, IT
545 fcvttgo RT, FRB, CVM, IT
546 fcvttgo. RT, FRB, CVM, IT
547 ```
548
549 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form |
550 |-----|------|-------|-------|-------|-------|----|----|---------|
551 | PO | RT | IT | CVM | FRB | XO | OE | Rc | XO-Form |
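
The field layout in the table above can be sketched as a bit-packing helper.
PO and XO have not been allocated in this RFC, so they are plain parameters
here, and `encode_fcvttg` is a hypothetical name. Fields are listed in MSB0
order, from bit 0 down to bit 31.

```python
def encode_fcvttg(po: int, xo: int, rt: int, frb: int, cvm: int, it: int,
                  oe: int = 0, rc: int = 0) -> int:
    """Pack the XO-Form fields into a 32-bit instruction word."""
    fields = [  # (value, width), from bit 0 (most significant) to bit 31
        (po, 6), (rt, 5), (it, 2), (cvm, 3), (frb, 5), (xo, 9), (oe, 1), (rc, 1),
    ]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit its width")
        word = (word << width) | value
    return word
```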
552
```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP64((FRB))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```
653
Convert from a 64-bit float in FRB to an unsigned/signed 32/64-bit integer
in RT, with the conversion overflow/rounding semantics following the
chosen `CVM` value. `FPSCR` is modified and exceptions are raised as usual.
657
658 These instructions have an Rc=1 mode which sets CR0 in the normal
659 way for any instructions producing a GPR result. Additionally, when OE=1,
660 if the numerical value of the FP number is not 100% accurately preserved
661 (due to truncation or saturation and including when the FP number was
662 NaN) then this is considered to be an Integer Overflow condition, and
663 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
664 that overflow.
665
666 Special Registers altered:
667
```
CR0 (if Rc=1)
XER SO, OV, OV32 (if OE=1)
FPSCR (TODO: which bits?)
```
673
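
To see how `CVM`, `IT` and the rounding mode interact, the value written to RT
can be modelled in Python. This is a behavioural sketch only: `fcvttg_model`,
`ROUNDERS` and the `rn` parameter (standing in for `FPSCR.RN`) are
hypothetical, and `FPSCR` flags, CR0, OE and enabled exceptions are not
modelled.

```python
import math

ROUNDERS = {
    0b00: round,       # round to nearest, ties to even
    0b01: math.trunc,  # round toward zero
    0b10: math.ceil,   # round toward +infinity
    0b11: math.floor,  # round toward -infinity
}


def fcvttg_model(frb: float, cvm: int, it: int, rn: int = 0b00) -> int:
    """Return the 64-bit GPR value (as an unsigned int) for a double input."""
    if cvm not in range(6):
        raise ValueError("reserved CVM encoding")
    bits = 64 if it & 0b10 else 32
    signed = (it & 0b01) == 0
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    semantics = cvm >> 1  # 0 OpenPower, 1 Java/Saturating, 2 JavaScript
    if math.isnan(frb):
        value = lo if semantics == 0 else 0
    elif math.isinf(frb):
        value = 0 if semantics == 2 else (hi if frb > 0 else lo)
    else:
        rounded = (math.trunc if cvm & 1 else ROUNDERS[rn])(frb)
        if semantics == 2:
            value = rounded % (1 << bits)      # wrap modulo 2^bits
            if signed and value >= 1 << (bits - 1):
                value -= 1 << bits
        else:
            value = min(max(rounded, lo), hi)  # saturate
    # two's complement view: sign- or zero-extended to 64 bits
    return value & 0xFFFF_FFFF_FFFF_FFFF
```

For instance, `fcvttg_model(float("nan"), 0, 0)` gives
`0xFFFF_FFFF_8000_0000`, the sign-extended minimum 32-bit integer, while the
same input with `cvm=2` gives `0`.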
674 ### Assembly Aliases
675
676 | Assembly Alias | Full Instruction | Assembly Alias | Full Instruction |
677 |---------------------------|----------------------------|---------------------------|----------------------------|
678 | `fcvttgw RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 0` | `fcvttgd RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 2` |
679 | `fcvttgw. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 0` | `fcvttgd. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 2` |
680 | `fcvttgwo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 0` | `fcvttgdo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 2` |
681 | `fcvttgwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 0` | `fcvttgdo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 2` |
682 | `fcvttguw RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 1` | `fcvttgud RT, FRB, CVM` | `fcvttg RT, FRB, CVM, 3` |
683 | `fcvttguw. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 1` | `fcvttgud. RT, FRB, CVM` | `fcvttg. RT, FRB, CVM, 3` |
684 | `fcvttguwo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 1` | `fcvttgudo RT, FRB, CVM` | `fcvttgo RT, FRB, CVM, 3` |
685 | `fcvttguwo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 1` | `fcvttgudo. RT, FRB, CVM` | `fcvttgo. RT, FRB, CVM, 3` |
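
The alias table follows a regular pattern: an optional `u`, then `w` or `d`,
then the optional `o` and `.` suffixes. An assembler could expand it
mechanically. The sketch below is one possible way to do so;
`expand_fcvttg_alias` is a hypothetical helper and not part of any existing
toolchain.

```python
import re

_ALIAS = re.compile(r"^(fcvttg)(u?)([wd])(o?)(\.?)$")


def expand_fcvttg_alias(mnemonic: str, operands: str) -> str:
    """Rewrite an alias such as `fcvttguwo.` into the full instruction."""
    m = _ALIAS.match(mnemonic)
    if m is None:
        raise ValueError("not a fcvttg alias")
    base, unsigned, width, oe, rc = m.groups()
    it = (2 if width == "d" else 0) + (1 if unsigned else 0)
    return f"{base}{oe}{rc} {operands}, {it}"
```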
686
687 ----------
688
689 \newpage{}
690
691 ## Floating Convert Single To Integer In GPR
692
693 ```
694 fcvtstg RT, FRB, CVM, IT
695 fcvtstg. RT, FRB, CVM, IT
696 fcvtstgo RT, FRB, CVM, IT
697 fcvtstgo. RT, FRB, CVM, IT
698 ```
699
700 | 0-5 | 6-10 | 11-12 | 13-15 | 16-20 | 21-29 | 30 | 31 | Form |
701 |-----|------|-------|-------|-------|-------|----|----|---------|
702 | PO | RT | IT | CVM | FRB | XO | OE | Rc | XO-Form |
703
```
# based on xscvdpuxws
reset_xflags()
src <- bfp_CONVERT_FROM_BFP32(SINGLE((FRB)))

switch(IT)
    case(0):  # Signed 32-bit
        range_min <- bfp_CONVERT_FROM_SI32(0x8000_0000)
        range_max <- bfp_CONVERT_FROM_SI32(0x7FFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(1):  # Unsigned 32-bit
        range_min <- bfp_CONVERT_FROM_UI32(0)
        range_max <- bfp_CONVERT_FROM_UI32(0xFFFF_FFFF)
        js_mask <- 0xFFFF_FFFF
    case(2):  # Signed 64-bit
        range_min <- bfp_CONVERT_FROM_SI64(-0x8000_0000_0000_0000)
        range_max <- bfp_CONVERT_FROM_SI64(0x7FFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF
    default:  # Unsigned 64-bit
        range_min <- bfp_CONVERT_FROM_UI64(0)
        range_max <- bfp_CONVERT_FROM_UI64(0xFFFF_FFFF_FFFF_FFFF)
        js_mask <- 0xFFFF_FFFF_FFFF_FFFF

if CVM[2] = 1 or FPSCR.RN = 0b01 then
    rnd <- bfp_ROUND_TO_INTEGER_TRUNC(src)
else if FPSCR.RN = 0b00 then
    rnd <- bfp_ROUND_TO_INTEGER_NEAR_EVEN(src)
else if FPSCR.RN = 0b10 then
    rnd <- bfp_ROUND_TO_INTEGER_CEIL(src)
else if FPSCR.RN = 0b11 then
    rnd <- bfp_ROUND_TO_INTEGER_FLOOR(src)

switch(CVM)
    case(0, 1):  # OpenPower semantics
        if IsNaN(rnd) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    case(2, 3):  # Java/Saturating semantics
        if IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(rnd, range_max) then
            result <- ui64_CONVERT_FROM_BFP(range_max)
        else if bfp_COMPARE_LT(rnd, range_min) then
            result <- si64_CONVERT_FROM_BFP(range_min)
        else if IT[1] = 1 then  # Unsigned 32/64-bit
            result <- ui64_CONVERT_FROM_BFP(rnd)
        else  # Signed 32/64-bit
            result <- si64_CONVERT_FROM_BFP(rnd)
    default:  # JavaScript semantics
        # CVM = 6, 7 are illegal instructions
        # this works because the largest type we try to convert from has
        # 53 significand bits, and the largest type we try to convert to
        # has 64 bits, and the sum of those is strictly less than the 128
        # bits of the intermediate result.
        limit <- bfp_CONVERT_FROM_UI128([1] * 128)
        if IsInf(rnd) or IsNaN(rnd) then
            result <- [0] * 64
        else if bfp_COMPARE_GT(bfp_ABSOLUTE(rnd), limit) then
            result <- [0] * 64
        else
            result128 <- si128_CONVERT_FROM_BFP(rnd)
            result <- result128[64:127] & js_mask

switch(IT)
    case(0):  # Signed 32-bit
        result <- EXTS64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_SI32(result[32:63])
    case(1):  # Unsigned 32-bit
        result <- EXTZ64(result[32:63])
        result_bfp <- bfp_CONVERT_FROM_UI32(result[32:63])
    case(2):  # Signed 64-bit
        result_bfp <- bfp_CONVERT_FROM_SI64(result)
    default:  # Unsigned 64-bit
        result_bfp <- bfp_CONVERT_FROM_UI64(result)

if vxsnan_flag = 1 then SetFX(FPSCR.VXSNAN)
if vxcvi_flag = 1 then SetFX(FPSCR.VXCVI)
if xx_flag = 1 then SetFX(FPSCR.XX)

vx_flag <- vxsnan_flag | vxcvi_flag
vex_flag <- FPSCR.VE & vx_flag

if vex_flag = 0 then
    RT <- result
    FPSCR.FPRF <- undefined
    FPSCR.FR <- inc_flag
    FPSCR.FI <- xx_flag
    if IsNaN(src) or not bfp_COMPARE_EQ(src, result_bfp) then
        overflow <- 1  # signals SO only when OE = 1
else
    FPSCR.FR <- 0
    FPSCR.FI <- 0
```
804
Convert from a 32-bit float in FRB (held in the usual 32-bit float in
64-bit float format) to an unsigned/signed 32/64-bit integer in RT, with
the conversion overflow/rounding semantics following the chosen `CVM`
value. `FPSCR` is modified and exceptions are raised as usual.
809
810 These instructions have an Rc=1 mode which sets CR0 in the normal
811 way for any instructions producing a GPR result. Additionally, when OE=1,
812 if the numerical value of the FP number is not 100% accurately preserved
813 (due to truncation or saturation and including when the FP number was
814 NaN) then this is considered to be an Integer Overflow condition, and
815 CR0.SO, XER.SO and XER.OV are all set as normal for any GPR instructions
816 that overflow.
817
818 Special Registers altered:
819
```
CR0 (if Rc=1)
XER SO, OV, OV32 (if OE=1)
FPSCR (TODO: which bits?)
```
825
826 ### Assembly Aliases
827
828 | Assembly Alias | Full Instruction | Assembly Alias | Full Instruction |
829 |----------------------------|-----------------------------|----------------------------|-----------------------------|
830 | `fcvtstgw RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 0` | `fcvtstgd RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 2` |
831 | `fcvtstgw. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 0` | `fcvtstgd. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 2` |
832 | `fcvtstgwo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 0` | `fcvtstgdo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 2` |
833 | `fcvtstgwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 0` | `fcvtstgdo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 2` |
834 | `fcvtstguw RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 1` | `fcvtstgud RT, FRB, CVM` | `fcvtstg RT, FRB, CVM, 3` |
835 | `fcvtstguw. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 1` | `fcvtstgud. RT, FRB, CVM` | `fcvtstg. RT, FRB, CVM, 3` |
836 | `fcvtstguwo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 1` | `fcvtstgudo RT, FRB, CVM` | `fcvtstgo RT, FRB, CVM, 3` |
837 | `fcvtstguwo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 1` | `fcvtstgudo. RT, FRB, CVM` | `fcvtstgo. RT, FRB, CVM, 3` |
838
839 ----------
840
841 \newpage{}
842
843 ----------
844
845 # Appendices
846
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic
851
852 |Form| Book | Page | Version | mnemonic | Description |
853 |----|------|------|---------|----------|-------------|
854 |VA | I | # | 3.2B |todo | |
855
856 ----------------
857
858 [[!tag opf_rfc]]