# Zftrans - transcendental operations

See:

* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
* [[rv_major_opcode_1010011]] for opcode listing.

Extension subsets:

* **Zftrans**: standard transcendentals (best suited to 3D)
* **ZftransExt**: extra functions (useful, not generally needed for 3D,
  can be synthesised using Zftrans)
* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
* **Zfhyp**: hyperbolic/inverse-hyperbolic: sinh, cosh, tanh, asinh,
  acosh, atanh (can be synthesised - see below)
* **ZftransAdv**: much more complex to implement in hardware

Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
Zarctrignpi

[[!toc levels=2]]

# TODO:

* Decision on accuracy
  <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002355.html>
* Errors **MUST** be repeatable.
* How about three Platform Specifications? 3D, UNIX and Embedded?
  <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002361.html>

# List of 2-arg opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
FROOT | x power 1/y | rd = pow(rs1, 1/rs2) | ZftransAdv |
FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | Zftrans |
"""]]
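
The pseudo-code column above can be read as a small software reference model. The following is a minimal sketch in Python, assuming double-precision host arithmetic and taking the operand order literally from the table (`rs2` is the first argument of `atan2`). The function names are illustrative only and are not part of the proposal.

```python
import math

# Illustrative reference model of the 2-arg opcodes (names are hypothetical).
# Operand order follows the table: rd = atan2(rs2, rs1).

def fatan2(rs1, rs2):
    return math.atan2(rs2, rs1)

def fatan2pi(rs1, rs2):
    # Same as FATAN2, with the result expressed in units of pi.
    return math.atan2(rs2, rs1) / math.pi

def fpow(rs1, rs2):
    return math.pow(rs1, rs2)

def froot(rs1, rs2):
    # y-th root of x, expressed as x to the power 1/y.
    return math.pow(rs1, 1.0 / rs2)

def fhypot(rs1, rs2):
    # math.hypot avoids overflow in the intermediate squares.
    return math.hypot(rs1, rs2)
```

For example, `fhypot(3.0, 4.0)` gives `5.0`, and `fatan2pi(1.0, 1.0)` is approximately `0.25`, i.e. a quarter of pi radians.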

# List of 1-arg transcendental opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | Zftrans |
FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
FLOG2 | log2 | rd = log2(rs1) | Zftrans |
FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | Zftrans |
FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | Zftrans |
FEXP | exponent | rd = pow(e, rs1) | ZftransExt |
FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
FLOG10 | log base 10 | rd = log10(rs1) | ZftransExt |
"""]]
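
The ZftransExt opcodes in this table can be expressed in terms of FEXP2 and FLOG2 by a change of base, which is one way to read the statement that ZftransExt can be synthesised from Zftrans. The sketch below illustrates this in Python under the assumption of double-precision host arithmetic. The helper names are hypothetical, and the extra multiply or divide adds a rounding step, so the results are not guaranteed to match a dedicated implementation bit for bit.

```python
import math

# Change-of-base constants (computed on the host; illustrative only).
LOG2_E = math.log2(math.e)
LOG2_10 = math.log2(10.0)

def fexp2(x):
    # Stand-in for the FEXP2 opcode.
    return 2.0 ** x

def flog2(x):
    # Stand-in for the FLOG2 opcode.
    return math.log2(x)

def exp_via_exp2(x):
    # e^x = 2^(x * log2(e))
    return fexp2(x * LOG2_E)

def log_via_log2(x):
    # ln(x) = log2(x) / log2(e)
    return flog2(x) / LOG2_E

def exp10_via_exp2(x):
    # 10^x = 2^(x * log2(10))
    return fexp2(x * LOG2_10)

def log10_via_log2(x):
    # log10(x) = log2(x) / log2(10)
    return flog2(x) / LOG2_10
```

Any accuracy requirement decided under the TODO above would need to account for the additional rounding in such sequences.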

# List of 1-arg trigonometric opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
"""]]
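
To make the pi-scaled variants concrete, here is a minimal Python sketch that follows the OpenCL extended instruction set definitions linked at the top of this page: the forward functions multiply the argument by pi, while the arc functions divide the result by pi. The function names are illustrative, and the model is naive: it multiplies by a rounded value of pi, so for instance `fsinpi(1.0)` returns a tiny non-zero value where an exact implementation would return zero.

```python
import math

# Naive host model of the pi-scaled trig opcodes (names are hypothetical).

def fsinpi(x):
    return math.sin(math.pi * x)

def fcospi(x):
    return math.cos(math.pi * x)

def ftanpi(x):
    return math.tan(math.pi * x)

def fasinpi(x):
    # Result is in units of pi: asinpi(1.0) is 0.5, i.e. pi/2 radians.
    return math.asin(x) / math.pi

def facospi(x):
    return math.acos(x) / math.pi

def fatanpi(x):
    return math.atan(x) / math.pi
```

With these definitions the arc functions invert the forward ones within their principal range, e.g. `fasinpi(fsinpi(0.25))` is approximately `0.25`.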

# Synthesis, Pseudo-code ops and macro-ops

The pseudo-ops below are best left to the compiler rather than defined
as actual assembler pseudo-ops: the compiler can allocate one scalar FP
register as a loop-invariant constant, set to "1.0" at the beginning of
a function or other suitable code block.

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0), using FATAN2
* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0), using FATAN2PI
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).

FATANPI example pseudo-code:

    lui t0, 0x3F800 // upper 20 bits of f32 1.0 (0x3F800000)
    fmv.s.x ft0, t0 // move the bit pattern from the integer to the FP register
    fatan2pi.s rd, rs1, ft0
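
The constant used in this sequence can be checked on a host machine. The sketch below, in Python, assumes that `lui` places its 20-bit immediate in bits 31..12 of the register, and confirms that the resulting bit pattern is the IEEE 754 single-precision encoding of 1.0. It also models the alias itself as `atan2(x, 1.0) / pi`. The helper names are hypothetical.

```python
import math
import struct

def f32_from_bits(bits):
    # Reinterpret a 32-bit integer as an IEEE 754 single-precision float.
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# lui t0, 0x3F800 leaves 0x3F800 in bits 31..12 and zeros below.
T0 = 0x3F800 << 12
ONE = f32_from_bits(T0)

def fatanpi_alias(x):
    # FATANPI as an alias: atan2pi(x, 1.0), i.e. atan2(x, 1.0) / pi.
    return math.atan2(x, ONE) / math.pi
```

Here `T0` is `0x3F800000` and `ONE` compares equal to `1.0`, so the alias agrees with `atan(x) / pi` for any finite `x`.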

Hyperbolic function example (obviates need for Zfhyp except for high-performance):

    ASINH( x ) = ln( x + SQRT(x**2+1))
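
The same approach extends to the other inverse hyperbolic functions through their standard logarithmic identities. The following Python sketch checks these against the host maths library. It assumes double-precision arithmetic, and the function names are illustrative. These direct formulas are not numerically robust everywhere: the asinh form, for example, suffers cancellation for large negative x, which is part of why a dedicated Zfhyp may still be wanted.

```python
import math

# Inverse hyperbolic functions synthesised from log and sqrt
# (standard identities; names are hypothetical).

def asinh_synth(x):
    return math.log(x + math.sqrt(x * x + 1.0))

def acosh_synth(x):
    # Valid for x >= 1.
    return math.log(x + math.sqrt(x * x - 1.0))

def atanh_synth(x):
    # Valid for -1 < x < 1.
    return 0.5 * math.log((1.0 + x) / (1.0 - x))
```

For moderate arguments such as `asinh_synth(0.5)` or `acosh_synth(2.0)` the results agree closely with `math.asinh` and `math.acosh`.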

LOG / LOG1P example:

    LOG(x) = LOG1P(x - 1.0)
    EXP(x) = EXPM1(x) + 1.0
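
Since log1p(x) is by definition log(1 + x) and expm1(x) is exp(x) - 1, the plain functions are recovered by shifting the argument of log1p down by one and the result of expm1 up by one. A minimal Python check of both identities, with hypothetical helper names and double-precision host arithmetic assumed:

```python
import math

def log_via_log1p(x):
    # log(x) = log1p(x - 1), because log1p(y) = log(1 + y).
    return math.log1p(x - 1.0)

def exp_via_expm1(x):
    # exp(x) = expm1(x) + 1, because expm1(x) = exp(x) - 1.
    return math.expm1(x) + 1.0
```

For arguments of moderate size, such as `log_via_log1p(2.5)` and `exp_via_expm1(0.75)`, these agree with `math.log` and `math.exp` to within rounding.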

# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?

RISC principle says "exclude LOG because it's covered by LOG1P plus an ADD".
Research is needed to ensure that implementors are not compromised by such
a decision:
<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002358.html>
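
One concrete issue that such research would need to cover is the rounding of the extra add or subtract. The Python sketch below is an illustration under the assumption of double-precision host arithmetic, not a statement about any particular hardware. It shows that for a very small positive input, `x - 1.0` rounds to exactly `-1.0`, so a log built from log1p plus a subtract no longer sees the original value. The helper name is hypothetical.

```python
import math

def log_via_log1p(x):
    # log(x) synthesised as log1p(x - 1.0).
    return math.log1p(x - 1.0)

x = 1e-20
direct = math.log(x)   # finite, roughly -46.05
shifted = x - 1.0      # rounds to exactly -1.0 in binary64

try:
    synthesised = log_via_log1p(x)
except ValueError:
    # Python raises for log1p(-1.0); IEEE 754 hardware would
    # typically return -infinity instead.
    synthesised = None
```

In this case `direct` is a perfectly ordinary finite number, while the synthesised path fails, so a straightforward two-instruction replacement of LOG would not be accurate over the full input range.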