# Zftrans - transcendental operations

See:

* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
* [[rv_major_opcode_1010011]] for opcode listing.

Extension subsets:

* **Zftrans**: standard transcendentals (best suited to 3D)
* **ZftransExt**: extra functions (useful, not generally needed for 3D,
  can be synthesised using Zftrans)
* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
* **Zfhyp**: hyperbolic/inverse-hyperbolic: sinh, cosh, tanh, asinh,
  acosh, atanh (can be synthesised - see below)
* **ZftransAdv**: much more complex to implement in hardware
* **Zfrsqrt**: reciprocal square-root

Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
Zarctrignpi

[[!toc levels=2]]

# TODO:

* Decision on accuracy
  <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002355.html>
* Errors **MUST** be repeatable.
* How about three Platform Specifications? 3D, UNIX and Embedded?
  <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002361.html>
* Reciprocal Square-root is in its own separate extension (Zfrsqrt) as
  it is desirable on its own by other implementors. This is to be evaluated.

# List of 2-arg opcodes

[[!table data="""
opcode   | Description            | pseudo-code               | Extension   |
FATAN2   | atan2 arc tangent      | rd = atan2(rs2, rs1)      | Zarctrignpi |
FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi  |
FPOW     | x power of y           | rd = pow(rs1, rs2)        | ZftransAdv  |
FROOT    | x power 1/y            | rd = pow(rs1, 1/rs2)      | ZftransAdv  |
FHYPOT   | hypotenuse             | rd = sqrt(rs1^2 + rs2^2)  | Zftrans     |
"""]]

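The pseudo-code column of the 2-arg table can be read as the following behavioural sketch. It is not a definition of the instructions: Python floats are f64 rather than f32, rounding modes and exception flags are ignored, and the function names are hypothetical, chosen only to mirror the opcode names. The argument order follows the table as written.

```python
import math

# Behavioural sketch only (assumption: f64 Python floats stand in for f32,
# no rounding-mode or exception-flag modelling).

def fatan2(rs1, rs2):
    # Table: rd = atan2(rs2, rs1)
    return math.atan2(rs2, rs1)

def fatan2pi(rs1, rs2):
    # Table: rd = atan2(rs2, rs1) / pi
    return math.atan2(rs2, rs1) / math.pi

def fpow(rs1, rs2):
    # Table: rd = pow(rs1, rs2)
    return math.pow(rs1, rs2)

def froot(rs1, rs2):
    # Table: rd = pow(rs1, 1/rs2)
    return math.pow(rs1, 1.0 / rs2)

def fhypot(rs1, rs2):
    # Table: rd = sqrt(rs1^2 + rs2^2)
    return math.hypot(rs1, rs2)

print(fhypot(3.0, 4.0), froot(8.0, 3.0), fatan2pi(1.0, 1.0))
```
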
# List of 1-arg transcendental opcodes

[[!table data="""
opcode  | Description            | pseudo-code             | Extension  |
FRSQRT  | Reciprocal Square-root | rd = 1.0 / sqrt(rs1)    | Zfrsqrt    |
FCBRT   | Cube Root              | rd = pow(rs1, 1.0 / 3)  | Zftrans    |
FEXP2   | power-of-2             | rd = pow(2, rs1)        | Zftrans    |
FLOG2   | log2                   | rd = log2(rs1)          | Zftrans    |
FEXPM1  | exponent minus 1       | rd = pow(e, rs1) - 1.0  | Zftrans    |
FLOG1P  | log plus 1             | rd = log(e, 1 + rs1)    | Zftrans    |
FEXP    | exponent               | rd = pow(e, rs1)        | ZftransExt |
FLOG    | natural log (base e)   | rd = log(e, rs1)        | ZftransExt |
FEXP10  | power-of-10            | rd = pow(10, rs1)       | ZftransExt |
FLOG10  | log base 10            | rd = log10(rs1)         | ZftransExt |
"""]]

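The claim that the ZftransExt functions can be synthesised from Zftrans can be illustrated with FEXP2 and FLOG2 plus one multiply by a constant. This is a sketch under assumptions: the helper names are hypothetical, Python f64 arithmetic stands in for the hardware, and the extra multiply adds rounding error that a real implementation would have to budget for.

```python
import math

LOG2_E = math.log2(math.e)     # constant a compiler could preload
LOG2_10 = math.log2(10.0)
LN_2 = math.log(2.0)
LOG10_2 = math.log10(2.0)

def fexp2(x):
    # Stand-in for FEXP2: rd = pow(2, rs1)
    return math.pow(2.0, x)

def flog2(x):
    # Stand-in for FLOG2: rd = log2(rs1)
    return math.log2(x)

def fexp_synth(x):
    # e**x == 2**(x * log2(e))
    return fexp2(x * LOG2_E)

def fexp10_synth(x):
    # 10**x == 2**(x * log2(10))
    return fexp2(x * LOG2_10)

def flog_synth(x):
    # ln(x) == log2(x) * ln(2)
    return flog2(x) * LN_2

def flog10_synth(x):
    # log10(x) == log2(x) * log10(2)
    return flog2(x) * LOG10_2

print(fexp_synth(1.0), flog_synth(10.0), fexp10_synth(3.0), flog10_synth(1000.0))
```
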
# List of 1-arg trigonometric opcodes

[[!table data="""
opcode  | Description              | pseudo-code          | Extension   |
FSIN    | sin (radians)            | rd = sin(rs1)        | Ztrignpi    |
FCOS    | cos (radians)            | rd = cos(rs1)        | Ztrignpi    |
FTAN    | tan (radians)            | rd = tan(rs1)        | Ztrignpi    |
FASIN   | arcsin (radians)         | rd = asin(rs1)       | Zarctrignpi |
FACOS   | arccos (radians)         | rd = acos(rs1)       | Zarctrignpi |
FSINPI  | sin times pi             | rd = sin(pi * rs1)   | Ztrigpi     |
FCOSPI  | cos times pi             | rd = cos(pi * rs1)   | Ztrigpi     |
FTANPI  | tan times pi             | rd = tan(pi * rs1)   | Ztrigpi     |
FASINPI | arcsin / pi              | rd = asin(rs1) / pi  | Zarctrigpi  |
FACOSPI | arccos / pi              | rd = acos(rs1) / pi  | Zarctrigpi  |
FATANPI | arctan / pi              | rd = atan(rs1) / pi  | Zarctrigpi  |
FSINH   | hyperbolic sin (radians) | rd = sinh(rs1)       | Zfhyp       |
FCOSH   | hyperbolic cos (radians) | rd = cosh(rs1)       | Zfhyp       |
FTANH   | hyperbolic tan (radians) | rd = tanh(rs1)       | Zfhyp       |
FASINH  | inverse hyperbolic sin   | rd = asinh(rs1)      | Zfhyp       |
FACOSH  | inverse hyperbolic cos   | rd = acosh(rs1)      | Zfhyp       |
FATANH  | inverse hyperbolic tan   | rd = atanh(rs1)      | Zfhyp       |
"""]]

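A short sketch of how the pi-scaled forms relate to each other, assuming the OpenCL convention referenced at the top of this page: sinpi(x) is sin(pi * x), while the arc forms divide the result by pi, so that asinpi is the inverse of sinpi. The function names are hypothetical and the model is a naive f64 one; it says nothing about how hardware would perform the range reduction.

```python
import math

def fsinpi(x):
    return math.sin(math.pi * x)

def fcospi(x):
    return math.cos(math.pi * x)

def fasinpi(x):
    # Inverse of fsinpi on [-0.5, 0.5]: the result is in units of pi
    return math.asin(x) / math.pi

def facospi(x):
    # Inverse of fcospi on [0, 1]
    return math.acos(x) / math.pi

def fatanpi(x):
    return math.atan(x) / math.pi

print(fasinpi(fsinpi(0.25)), facospi(fcospi(0.25)), fatanpi(1.0))

# The naive model is not exact at multiples of pi, because math.pi is
# itself a rounded value: this prints a tiny non-zero number.
print(fsinpi(1.0))
```

The last line hints at one attraction of the xxx-pi forms: an implementation that reduces the argument before multiplying by pi can return exact results at integer and half-integer inputs, which the naive model above cannot.
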
# Synthesis, Pseudo-code ops and macro-ops

The pseudo-ops are best left up to the compiler rather than being actual
pseudo-ops: the compiler can allocate one scalar FP register as a
loop-invariant constant, set to "1.0" at the beginning of a function or
other suitable code block.

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0) - FATAN2
* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0) - FATAN2PI
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).

FATANPI example pseudo-code:

    lui t0, 0x3F800 // upper 20 bits of f32 1.0 (0x3F800000)
    fmv.w.x ft0, t0 // integer to FP register move (formerly named fmv.s.x)
    fatan2pi.s rd, rs1, ft0

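The constant used in the FATANPI example can be checked with a few lines of Python: the IEEE 754 single-precision encoding of 1.0 is 0x3F800000, whose low 12 bits are zero, so a single `lui` with 0x3F800 is enough to build it. The sketch also models the alias itself as atan2pi(x, 1.0), as stated in the list of pseudo-ops; the helper names are hypothetical.

```python
import math
import struct

# Bit pattern of f32 1.0, read back as an unsigned 32-bit integer.
bits = struct.unpack("<I", struct.pack("<f", 1.0))[0]
print(hex(bits))          # full 32-bit pattern
print(hex(bits >> 12))    # the 20-bit immediate that lui loads
print(bits & 0xFFF)       # low 12 bits, which lui leaves as zero

def atan2pi(y, x):
    # Conventional atan2(y, x), scaled into units of pi.
    return math.atan2(y, x) / math.pi

def fatanpi_alias(x):
    # FATANPI as an alias: rd = atan2pi(rs1, 1.0)
    return atan2pi(x, 1.0)

print(fatanpi_alias(1.0))
```
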
Hyperbolic example (obviates the need for Zfhyp except for high-performance
implementations):

    ASINH( x ) = ln( x + SQRT(x**2+1))

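The ASINH identity can be checked numerically against a library asinh. The sketch below is an assumption-laden model, not a proposed implementation: it uses f64 Python floats, and it computes the square-root term with `math.hypot(x, 1.0)`, which has the same shape as FHYPOT with a constant 1.0 operand.

```python
import math

def asinh_synth(x):
    # ASINH(x) = ln(x + sqrt(x**2 + 1)); hypot(x, 1.0) is sqrt(x**2 + 1)
    return math.log(x + math.hypot(x, 1.0))

for x in (0.0, 0.5, 1.0, 10.0):
    print(x, asinh_synth(x), math.asinh(x))
```

For large negative x the two terms inside the logarithm cancel, so a real implementation would likely use the symmetry asinh(-x) = -asinh(x) and evaluate the formula on |x| only.
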
LOG / LOG1P example:

    LOG(x) = LOG1P(x - 1.0)
    EXP(x) = EXPM1(x) + 1.0

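Taking the definitions from the opcode table, log1p(x) = log(1 + x) and expm1(x) = exp(x) - 1, the two substitutions are LOG(x) = LOG1P(x - 1.0) and EXP(x) = EXPM1(x) + 1.0. A minimal numerical check, with hypothetical helper names and f64 Python floats standing in for the hardware:

```python
import math

def log_via_log1p(x):
    # log(x) = log(1 + (x - 1)) = log1p(x - 1)
    return math.log1p(x - 1.0)

def exp_via_expm1(x):
    # exp(x) = (exp(x) - 1) + 1 = expm1(x) + 1
    return math.expm1(x) + 1.0

for x in (0.5, 1.0, 2.0, 10.0):
    print(x, log_via_log1p(x), math.log(x), exp_via_expm1(x), math.exp(x))
```
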
# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?

RISC principle says "exclude LOG because it's covered by LOG1P plus an ADD".
Research is needed to ensure that implementors are not compromised by such
a decision:
<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002358.html>
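
One data point for that research, offered as a sketch rather than a verdict: when x is much smaller than 1.0, the ADD in "LOG1P plus an ADD" rounds x - 1.0 to exactly -1.0, and the information in x is lost before LOG1P ever sees it. The example uses f64 Python floats and a hypothetical helper name; the same effect would be expected to appear at larger x in f32.

```python
import math

def log_via_log1p(x):
    """LOG synthesised as LOG1P(x - 1.0), the 'LOG1P plus an ADD' route."""
    t = x - 1.0                # the ADD: rounds to -1.0 when x is tiny
    if t == -1.0:
        # math.log1p(-1.0) raises ValueError in Python; model it as -inf,
        # which is what an IEEE 754 log1p would return.
        return -math.inf
    return math.log1p(t)

x = 1e-20
direct = math.log(x)           # a finite, accurate result
synth = log_via_log1p(x)       # all information about x has been lost
print(direct, synth)

# The opposite direction has the mirror-image problem: computing
# log1p(x) as log(1.0 + x) loses x entirely when x is tiny.
print(math.log(1.0 + x), math.log1p(x))
```
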