# Zftrans - transcendental operations
* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
* [[rv_major_opcode_1010011]] for opcode listing.
* **Zftrans**: standard transcendentals (best suited to 3D)
* **ZftransExt**: extra functions (useful, not generally needed for 3D,
  can be synthesised using Zftrans)
* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
* **Zfhyp**: hyperbolic/inverse-hyperbolic: sinh, cosh, tanh, asinh,
  acosh, atanh (can be synthesised - see below)
* **ZftransAdv**: much more complex to implement in hardware
* **Zfrsqrt**: Reciprocal square-root.

Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,

* Decision on accuracy:
  <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002355.html>
* Errors **MUST** be repeatable.
* How about three Platform Specifications? 3D, UNIX and Embedded?
  <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002361.html>
* Reciprocal Square-root is in its own separate extension (Zfrsqrt) as
  it is desirable on its own by other implementors. This is to be evaluated.

# List of 2-arg opcodes

opcode | Description | pseudo-code | Extension |
FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
FROOT | x power 1/y | rd = pow(rs1, 1/rs2) | ZftransAdv |
FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | Zftrans |
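
The pseudo-code column above can be read as a small software reference model. The following is a minimal sketch in Python, not a definition of the instructions: the function names are hypothetical, Python floats are f64 rather than f32, and `atan2` is assumed to follow the C convention `atan2(y, x)`, applied literally to the operand order shown in the table.

```python
import math

# Hypothetical reference model of the 2-arg opcodes, transcribed from the
# pseudo-code column. Assumption: atan2 follows the C convention atan2(y, x).

def fatan2(rs1: float, rs2: float) -> float:
    # FATAN2: rd = atan2(rs2, rs1)
    return math.atan2(rs2, rs1)

def fatan2pi(rs1: float, rs2: float) -> float:
    # FATAN2PI: rd = atan2(rs2, rs1) / pi
    return math.atan2(rs2, rs1) / math.pi

def fpow(rs1: float, rs2: float) -> float:
    # FPOW: rd = pow(rs1, rs2)
    return math.pow(rs1, rs2)

def froot(rs1: float, rs2: float) -> float:
    # FROOT: rd = pow(rs1, 1/rs2)
    return math.pow(rs1, 1.0 / rs2)

def fhypot(rs1: float, rs2: float) -> float:
    # FHYPOT: rd = sqrt(rs1^2 + rs2^2); math.hypot avoids intermediate overflow
    return math.hypot(rs1, rs2)
```

A model like this is only useful for spot-checking results; it says nothing about the accuracy or rounding a hardware implementation would have to meet.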

# List of 1-arg transcendental opcodes

opcode | Description | pseudo-code | Extension |
FRSQRT | Reciprocal Square-root | rd = 1.0 / sqrt(rs1) | Zfrsqrt |
FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | Zftrans |
FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
FLOG2 | log2 | rd = log2(rs1) | Zftrans |
FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | Zftrans |
FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | Zftrans |
FEXP | exponent | rd = pow(e, rs1) | ZftransExt |
FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
FLOG10 | log base 10 | rd = log10(rs1) | ZftransExt |
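
As a cross-check of the table, here is a hedged sketch of the 1-arg transcendental opcodes as Python functions. The names are hypothetical and the arithmetic is f64. It takes FRSQRT to mean `1.0 / sqrt(x)` and FCBRT to mean the real cube root; handling negative inputs to FCBRT by sign is an assumption made here, not something the table specifies.

```python
import math

# Hypothetical reference model of the 1-arg transcendental opcodes.

def frsqrt(x: float) -> float:
    return 1.0 / math.sqrt(x)          # reciprocal square root

def fcbrt(x: float) -> float:
    # Assumption: cube root of a negative input is the negative real root.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def fexp2(x: float) -> float:
    return math.pow(2.0, x)

def flog2(x: float) -> float:
    return math.log2(x)

def fexpm1(x: float) -> float:
    return math.expm1(x)               # e**x - 1, accurate near zero

def flog1p(x: float) -> float:
    return math.log1p(x)               # ln(1 + x), accurate near zero

def fexp(x: float) -> float:
    return math.exp(x)

def flog(x: float) -> float:
    return math.log(x)

def fexp10(x: float) -> float:
    return math.pow(10.0, x)

def flog10(x: float) -> float:
    return math.log10(x)
```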

# List of 1-arg trigonometric opcodes

opcode | Description | pseudo-code | Extension |
FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
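
The pi-scaled variants are easy to get backwards, so a short sketch may help. It assumes the OpenCL definitions linked at the top of this page, where `sinpi(x)` is `sin(pi * x)` and `asinpi(x)` is `asin(x) / pi`. The function names are hypothetical and the arithmetic is f64; a real implementation would reduce the argument exactly rather than multiplying by a rounded pi.

```python
import math

# Hypothetical model of the pi-scaled trig opcodes (OpenCL-style semantics).

def fsinpi(x: float) -> float:
    return math.sin(math.pi * x)       # input is in units of pi radians

def fcospi(x: float) -> float:
    return math.cos(math.pi * x)

def ftanpi(x: float) -> float:
    return math.tan(math.pi * x)

def fasinpi(x: float) -> float:
    return math.asin(x) / math.pi      # result is in units of pi radians

def facospi(x: float) -> float:
    return math.acos(x) / math.pi

def fatanpi(x: float) -> float:
    return math.atan(x) / math.pi
```

With these definitions the forward and inverse functions round-trip, for example `fasinpi(fsinpi(0.25))` is approximately `0.25`.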

# Synthesis, Pseudo-code ops and macro-ops

The pseudo-ops are best left up to the compiler rather than being actual
pseudo-ops, by allocating one scalar FP register for use as a constant
(loop invariant) set to "1.0" at the beginning of a function or other

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN rd, rs1 - pseudo-code alias for rd = atan2(rs1, 1.0), using FATAN2
* FATANPI rd, rs1 - pseudo-code alias for rd = atan2pi(rs1, 1.0), using FATAN2PI
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
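
The aliases above all reduce to existing operations plus a constant `1.0`. As a rough illustration, the sketch below writes them directly against Python's `math.atan2(y, x)`, so it makes no claim about register operand order. The names `ONE`, `frcp`, `fatan`, `fatanpi`, `fsincos` and `fsincospi` are hypothetical helpers, and the "fused" pair is modelled simply as two results returned together.

```python
import math

# The loop-invariant constant a compiler would keep in one scalar FP register.
ONE = 1.0

def frcp(x: float) -> float:
    return ONE / x                          # FRCP: 1.0 / x

def fatan(x: float) -> float:
    return math.atan2(x, ONE)               # FATAN via atan2(x, 1.0)

def fatanpi(x: float) -> float:
    return math.atan2(x, ONE) / math.pi     # FATANPI via atan2pi(x, 1.0)

def fsincos(x: float) -> tuple:
    return math.sin(x), math.cos(x)         # FSIN then FCOS, in that order

def fsincospi(x: float) -> tuple:
    return math.sin(math.pi * x), math.cos(math.pi * x)
```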

FATANPI example pseudo-code:

    lui t0, 0x3F800          // upper 20 bits of f32 1.0 (0x3F800000)
    fmv.w.x ft0, t0          // move the bit pattern into FP register ft0
    fatan2pi.s rd, rs1, ft0
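
The constant in the example can be checked by hand: `lui` places its 20-bit immediate in bits 31..12 and zeroes the low 12 bits, so `0x3F800` becomes `0x3F800000`, which is the IEEE 754 single-precision encoding of 1.0. A small sketch using a hypothetical helper `lui_as_f32`:

```python
import struct

def lui_as_f32(imm20: int) -> float:
    """Return the f32 value whose bit pattern is what `lui` would load."""
    bits = (imm20 << 12) & 0xFFFFFFFF       # lui: immediate into bits 31..12
    # Reinterpret the 32-bit integer pattern as a little-endian f32.
    return struct.unpack("<f", struct.pack("<I", bits))[0]

one = lui_as_f32(0x3F800)
```

The same trick works for any f32 constant whose low 12 mantissa bits are zero, such as 2.0 (`0x40000`) or 0.5 (`0x3F000`).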

Hyperbolic function example (obviates need for Zfhyp except for high-performance):

    ASINH( x ) = ln( x + SQRT(x**2+1) )
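
The ASINH identity can be checked numerically against a library implementation. This is a minimal sketch with a hypothetical helper `asinh_synth`, in f64; it uses only a log, a square root, a multiply and two adds.

```python
import math

def asinh_synth(x: float) -> float:
    # asinh(x) = ln(x + sqrt(x**2 + 1)), synthesised from log and sqrt
    return math.log(x + math.sqrt(x * x + 1.0))

# Compare with the library asinh at a few moderate arguments.
samples = [0.5, 1.0, 3.0, -2.0]
errors = [abs(asinh_synth(x) - math.asinh(x)) for x in samples]
```

The straightforward formula loses accuracy when `x` is very small or is large and negative, because `x + sqrt(x**2 + 1)` then suffers rounding or cancellation, so a synthesised version trades accuracy as well as speed against a dedicated instruction.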

    LOG(x) = LOG1P(x - 1.0)
    EXP(x) = EXPM1(x) + 1.0
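
These identities follow from the definitions `log1p(x) = ln(1 + x)` and `expm1(x) = e**x - 1`: substituting `x - 1` into the first gives `ln(x)`, and adding 1 to the second gives `e**x`. The sketch below checks both in f64 and also shows why the reverse direction (building EXPM1 from EXP) is lossy near zero. The helper names are hypothetical.

```python
import math

def log_via_log1p(x: float) -> float:
    return math.log1p(x - 1.0)      # LOG synthesised from LOG1P plus a subtract

def exp_via_expm1(x: float) -> float:
    return math.expm1(x) + 1.0      # EXP synthesised from EXPM1 plus an add

def expm1_via_exp(x: float) -> float:
    return math.exp(x) - 1.0        # the reverse direction: cancels near zero

# Relative error of the reverse direction for a tiny argument.
tiny = 1e-10
rel_err = abs(expm1_via_exp(tiny) - math.expm1(tiny)) / math.expm1(tiny)
```

For `tiny = 1e-10` the relative error of `expm1_via_exp` is many orders of magnitude above f64 rounding error, which is the usual argument for treating EXPM1 and LOG1P as the primitive operations.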

# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?

RISC principle says "exclude LOG because it's covered by LOG1P plus an ADD".
Research is needed to ensure that implementors are not compromised by such
<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002358.html>