From: Jacob Lifshay
Date: Sat, 15 Apr 2023 01:42:09 +0000 (-0700)
Subject: fill out ls013 min/max fmin/fmax
X-Git-Tag: opf_rfc_ls008_v1~4
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=61d4047a5c4add0edb6068d6f26be2a0a2005bee;p=libreriscv.git

fill out ls013 min/max fmin/fmax
---

diff --git a/openpower/sv/rfc/ls013.mdwn b/openpower/sv/rfc/ls013.mdwn
index 192c598c7..08e7ec363 100644
--- a/openpower/sv/rfc/ls013.mdwn
+++ b/openpower/sv/rfc/ls013.mdwn
@@ -14,7 +14,7 @@
 
 **Target**: v3.2B
 
-**Source**: v3.0B
+**Source**: v3.1B
 
 **Books and Section affected**:
 
@@ -52,7 +52,7 @@
 **Keywords**:
 
 ```
-    GPR, FPR, minmax
+    GPR, FPR, min, max, fmin, fmax
 ```
 
 **Motivation**
@@ -61,19 +61,292 @@ TODO
 
 **Notes and Observations**:
 
-1. TODO
+1. minimum/maximum instructions are needed for vector reductions, where the SVP64 tree reduction needs a single instruction to work properly.
+2. if you implement any of the FP min/max modes, the rest are not much more hardware.
+3. FP min/max are rather complex to implement in software; the most commonly used FP max function `fmax` from glibc compiled for SFFS is 32 (!) instructions.
+
+https://gcc.godbolt.org/z/6xba61To6
+
+```
+    fmax(double, double):
+            fcmpu 0,1,2
+            fmr 0,1
+            cror 30,1,2
+            beq 7,.L12
+            blt 0,.L13
+            stfd 1,-16(1)
+            lis 9,0x8
+            li 8,-1
+            sldi 9,9,32
+            rldicr 8,8,0,11
+            ori 2,2,0
+            ld 10,-16(1)
+            xor 10,10,9
+            sldi 10,10,1
+            cmpld 0,10,8
+            bgt 0,.L5
+            stfd 2,-16(1)
+            ori 2,2,0
+            ld 10,-16(1)
+            xor 9,10,9
+            sldi 9,9,1
+            cmpld 0,9,8
+            ble 0,.L6
+    .L5:
+            fadd 1,0,2
+            blr
+    .L13:
+            fmr 1,2
+            blr
+    .L6:
+            fcmpu 0,2,2
+            fmr 1,2
+            bnulr 0
+    .L12:
+            fmr 1,0
+            blr
+            .long 0
+            .byte 0,9,0,0,0,0,0,0
+```
 
 **Changes**
 
 Add the following entries to:
 
 * the Appendices of Book I
-* Instructions of Book I added to Section 3.3.14.2
+* Book I 3.3.9 Fixed-Point Arithmetic Instructions
+* Book I 4.6.6.1 Floating-Point Elementary Arithmetic Instructions
+* Book I 1.6.1 and 1.6.2
 
 ----------------
 
 \newpage{}
 
+## `FMM` -- Floating Min/Max Mode
+
+
+| `FMM` | Assembly Alias                | Origin                         | Semantics                                       |
+|-------|-------------------------------|--------------------------------|-------------------------------------------------|
+| 0000  | fminnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = minNum(FRA, FRB) (1)                      |
+| 0001  | fmin19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = minimum(FRA, FRB)                         |
+| 0010  | fminnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minimumNumber(FRA, FRB)                   |
+| 0011  | fminc[s] FRT, FRA, FRB        | x86 minss or Win32's min macro | FRT = FRA \< FRB ? FRA : FRB                    |
+| 0100  | fminmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, False, fminnum08) (2) |
+| 0101  | fminmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fmin19) (2)    |
+| 0110  | fminmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, False, fminnum19) (2) |
+| 0111  | fminmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, False, fminc) (2)     |
+| 1000  | fmaxnum08[s] FRT, FRA, FRB    | IEEE 754-2008                  | FRT = maxNum(FRA, FRB) (1)                      |
+| 1001  | fmax19[s] FRT, FRA, FRB       | IEEE 754-2019                  | FRT = maximum(FRA, FRB)                         |
+| 1010  | fmaxnum19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = maximumNumber(FRA, FRB)                   |
+| 1011  | fmaxc[s] FRT, FRA, FRB        | x86 maxss or Win32's max macro | FRT = FRA > FRB ? FRA : FRB                     |
+| 1100  | fmaxmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3))      | FRT = minmaxmag(FRA, FRB, True, fmaxnum08) (2)  |
+| 1101  | fmaxmag19[s] FRT, FRA, FRB    | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmax19) (2)     |
+| 1110  | fmaxmagnum19[s] FRT, FRA, FRB | IEEE 754-2019                  | FRT = minmaxmag(FRA, FRB, True, fmaxnum19) (2)  |
+| 1111  | fmaxmagc[s] FRT, FRA, FRB     | -                              | FRT = minmaxmag(FRA, FRB, True, fmaxc) (2)      |
+
+Note (1): for the purposes of minNum/maxNum, -0.0 is defined to be less than +0.0. This is left unspecified in IEEE 754-2008.
+
+Note (2): minmaxmag(x, y, is_max, fallback) is defined as:
+
+```python
+def minmaxmag(x, y, is_max, fallback):
+    a = abs(x) < abs(y)
+    b = abs(x) > abs(y)
+    if is_max:
+        a, b = b, a  # swap
+    if a:
+        return x
+    if b:
+        return y
+    # equal magnitudes, or NaN input(s)
+    return fallback(x, y)
+```
+
+Note (3): TODO: check whether IEEE 754-2008 has min/maxMagNum like IEEE 754-2019's minimum/maximumMagnitudeNumber
+
+----------------
+
+\newpage{}
+
+## Floating Minimum/Maximum X-Form
+
+```
+    fminmax FRT, FRA, FRB, FMM
+```
+
+```
+    |0     |6     |11    |16    |21        |24    |31      |
+    | PO   | FRT  | FRA  | FRB  | FMM[0:2] | XO   | FMM[3] |
+```
+
+Compute the minimum/maximum of FRA and FRB, according to FMM, and store the result in FRT.
+
+Assembly Aliases: see [`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
+
+----------
+
+## Floating Minimum/Maximum Single X-Form
+
+```
+    fminmaxs FRT, FRA, FRB, FMM
+```
+
+```
+    |0     |6     |11    |16    |21        |24    |31      |
+    | PO   | FRT  | FRA  | FRB  | FMM[0:2] | XO   | FMM[3] |
+```
+
+Compute the minimum/maximum of FRA and FRB, according to FMM, and store the result in FRT.
+
+Assembly Aliases: see [`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
+
+----------
+
+\newpage{}
+
+## Minimum Unsigned X-Form
+
+```
+    minu RT, RA, RB
+    minu. RT, RA, RB
+```
+
+```
+    |0     |6     |11    |16    |21    |31    |
+    | PO   | RT   | RA   | RB   | XO   | Rc   |
+```
+
+```
+    if (RA) <u (RB) then
+        RT <- (RA)
+    else
+        RT <- (RB)
+```
+
+Compute the unsigned minimum of RA and RB and store the result in RT.
+
+Special Registers altered:
+
+```
+    CR0 (if Rc=1)
+```
+
+----------
+
+## Maximum Unsigned X-Form
+
+```
+    maxu RT, RA, RB
+    maxu. RT, RA, RB
+```
+
+```
+    |0     |6     |11    |16    |21    |31    |
+    | PO   | RT   | RA   | RB   | XO   | Rc   |
+```
+
+```
+    if (RA) >u (RB) then
+        RT <- (RA)
+    else
+        RT <- (RB)
+```
+
+Compute the unsigned maximum of RA and RB and store the result in RT.
+
+Special Registers altered:
+
+```
+    CR0 (if Rc=1)
+```
+
+----------
+
+\newpage{}
+
+## Minimum X-Form
+
+```
+    min RT, RA, RB
+    min. RT, RA, RB
+```
+
+```
+    |0     |6     |11    |16    |21    |31    |
+    | PO   | RT   | RA   | RB   | XO   | Rc   |
+```
+
+```
+    if (RA) < (RB) then
+        RT <- (RA)
+    else
+        RT <- (RB)
+```
+
+Compute the signed minimum of RA and RB and store the result in RT.
+
+Special Registers altered:
+
+```
+    CR0 (if Rc=1)
+```
+
+----------
+
+## Maximum X-Form
+
+```
+    max RT, RA, RB
+    max. RT, RA, RB
+```
+
+```
+    |0     |6     |11    |16    |21    |31    |
+    | PO   | RT   | RA   | RB   | XO   | Rc   |
+```
+
+```
+    if (RA) > (RB) then
+        RT <- (RA)
+    else
+        RT <- (RB)
+```
+
+Compute the signed maximum of RA and RB and store the result in RT.
+
+Special Registers altered:
+
+```
+    CR0 (if Rc=1)
+```
+
+----------
+
+\newpage{}
+
+# Instruction Formats
+
+Add the following entries to Book I 1.6.1.15 X-FORM:
+
+```
+    |0     |6     |11    |16    |21        |24    |31      |
+    | PO   | FRT  | FRA  | FRB  | FMM[0:2] | XO   | FMM[3] |
+```
+
+Add a new field to Book I 1.6.2 Word Instruction Fields:
+
+```
+    FMM (21:23,31)
+        Field used to specify minimum/maximum mode for fminmax[s].
+
+        Formats: X
+```
+
+----------
+
+\newpage{}
+
 # Appendices
 
 Appendix E Power ISA sorted by opcode
@@ -83,9 +356,12 @@ Add the following entries to:
 
 | Form | Book | Page | Version | mnemonic | Description |
 |------|------|------|---------|----------|-------------|
-| Z23  | I    | #    | 3.0B    | shadd    | Shift-and-Add |
-| Z23  | I    | #    | 3.0B    | shaddw   | Shift-and-Add Signed Word |
-| Z23  | I    | #    | 3.0B    | shadduw  | Shift-and-Add Unsigned Word |
+| X    | I    | #    | 3.2B    | fminmax  | Floating Minimum/Maximum |
+| X    | I    | #    | 3.2B    | fminmaxs | Floating Minimum/Maximum Single |
+| X    | I    | #    | 3.2B    | minu     | Minimum Unsigned |
+| X    | I    | #    | 3.2B    | maxu     | Maximum Unsigned |
+| X    | I    | #    | 3.2B    | min      | Minimum |
+| X    | I    | #    | 3.2B    | max      | Maximum |
 
 [[!tag opf_rfc]]
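
The integer semantics and the `FMM` alias table added by this patch can be sanity-checked with a small Python model. This is a minimal sketch and not part of the RFC: it assumes 64-bit GPRs, and the helper names `to_signed`, `min_s`, `max_s` and `fmm_alias` are hypothetical, chosen only for illustration. `fmm_alias` encodes one reading of the table, where the most significant `FMM` bit selects max over min, the next bit selects the magnitude variants, and the low two bits select the `num08`/`19`/`num19`/`c` flavour.

```python
MASK64 = (1 << 64) - 1  # assumption: registers are 64 bits wide


def to_signed(value):
    """Reinterpret a 64-bit register value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def minu(ra, rb):
    """Model of minu: if (RA) <u (RB) then RT <- (RA) else RT <- (RB)."""
    ra, rb = ra & MASK64, rb & MASK64
    return ra if ra < rb else rb


def maxu(ra, rb):
    """Model of maxu: if (RA) >u (RB) then RT <- (RA) else RT <- (RB)."""
    ra, rb = ra & MASK64, rb & MASK64
    return ra if ra > rb else rb


def min_s(ra, rb):
    """Model of min (signed compare); returns the raw register value."""
    ra, rb = ra & MASK64, rb & MASK64
    return ra if to_signed(ra) < to_signed(rb) else rb


def max_s(ra, rb):
    """Model of max (signed compare); returns the raw register value."""
    ra, rb = ra & MASK64, rb & MASK64
    return ra if to_signed(ra) > to_signed(rb) else rb


def fmm_alias(fmm, single=False):
    """Hypothetical helper: rebuild the alias mnemonic for a 4-bit FMM value."""
    if not 0 <= fmm <= 0b1111:
        raise ValueError("FMM is a 4-bit field")
    kind = "max" if fmm & 0b1000 else "min"      # FMM bit 0 (most significant)
    mag = "mag" if fmm & 0b0100 else ""          # FMM bit 1
    variant = ("num08", "19", "num19", "c")[fmm & 0b0011]  # FMM bits 2:3
    return "f" + kind + mag + variant + ("s" if single else "")


if __name__ == "__main__":
    minus_one = MASK64  # the bit pattern of -1 in a 64-bit register
    print(hex(minu(minus_one, 1)))   # unsigned: 1 is the smaller value
    print(hex(min_s(minus_one, 1)))  # signed: -1 is the smaller value
    print(fmm_alias(0b1101))         # alias for FMM = 1101
```

The same register contents give different results for `minu` and `min_s`, which is the reason the patch defines separate signed and unsigned instructions.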