# Ztrans - transcendental operations

See:

* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* [[rv_major_opcode_1010011]] for opcode listing.

Extension subsets:

* **Ztrans**: standard transcendentals (best suited to 3D)
* **ZtransExt**: extra functions (useful, not generally needed for 3D, can be synthesised using Ztrans)
* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
* **Ztrigh**: trig/arc-trig hyperbolic: sinh, cosh, tanh, asinh, acosh, atanh
* **ZtransAdv**: much more complex to implement in hardware
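
The statement that ZtransExt can be synthesised using Ztrans rests on change-of-base identities. The following is a minimal Python sketch, assuming FEXP2 and FLOG2 behave as `pow(2, x)` and `log2(x)`; the function names and the choice of a single extra multiply are illustrative assumptions, not part of the proposal.

```python
import math

# Constants a compiler could materialise once (assumed, for illustration).
LOG2_E = math.log2(math.e)   # log2(e)
LOG2_10 = math.log2(10.0)    # log2(10)
LN_2 = math.log(2.0)         # ln(2)
LOG10_2 = math.log10(2.0)    # log10(2)

def fexp2(x):
    """Ztrans FEXP2 model: pow(2, x)."""
    return 2.0 ** x

def flog2(x):
    """Ztrans FLOG2 model: log2(x)."""
    return math.log2(x)

def fexp(x):
    """ZtransExt FEXP built from FEXP2: e**x = 2**(x * log2(e))."""
    return fexp2(x * LOG2_E)

def flog(x):
    """ZtransExt FLOG built from FLOG2: ln(x) = log2(x) * ln(2)."""
    return flog2(x) * LN_2

def fexp10(x):
    """ZtransExt FEXP10 built from FEXP2: 10**x = 2**(x * log2(10))."""
    return fexp2(x * LOG2_10)

def flog10(x):
    """ZtransExt FLOG10 built from FLOG2: log10(x) = log2(x) * log10(2)."""
    return flog2(x) * LOG10_2
```

The extra multiply adds a rounding step, so results synthesised this way are close to, but not necessarily identical to, those of a dedicated instruction.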

Minimum recommended requirements for 3D: Ztrans, Ztrigpi, Zarctrigpi,
Zarctrignpi

[[!toc levels=2]]

# List of 2-arg opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
FPOW | x power of y | rd = pow(rs1, rs2) | ZtransAdv |
FROOT | x power 1/y | rd = pow(rs1, 1/rs2) | ZtransAdv |
"""]]
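
As a reading aid, the pseudo-code column of the 2-arg table can be written as a small Python reference model. This is only a sketch: the function names are hypothetical, the operand order is copied literally from the pseudo-code column, and special-case behaviour (NaN, infinities, negative bases) is whatever Python's `math` module does, not something the proposal specifies.

```python
import math

def fatan2(rs1, rs2):
    """FATAN2 as written in the table: rd = atan2(rs2, rs1)."""
    return math.atan2(rs2, rs1)

def fatan2pi(rs1, rs2):
    """FATAN2PI as written in the table: rd = atan2(rs2, rs1) / pi."""
    return math.atan2(rs2, rs1) / math.pi

def fpow(rs1, rs2):
    """FPOW: rd = pow(rs1, rs2)."""
    return math.pow(rs1, rs2)

def froot(rs1, rs2):
    """FROOT: rd = pow(rs1, 1 / rs2), i.e. the rs2-th root of rs1."""
    return math.pow(rs1, 1.0 / rs2)
```

One consequence of the division by pi is that the FATAN2PI result lies in the range -1.0 to 1.0 rather than -pi to pi.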

# List of 1-arg transcendental opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | Ztrans |
FEXP2 | power-of-2 | rd = pow(2, rs1) | Ztrans |
FLOG2 | log2 | rd = log2(rs1) | Ztrans |
FEXPM1 | exponential minus 1 | rd = pow(e, rs1) - 1.0 | Ztrans |
FLOG1P | natural log of (1 + x) | rd = log(e, 1 + rs1) | Ztrans |
FEXP | exponential (base e) | rd = pow(e, rs1) | ZtransExt |
FLOG | natural log (base e) | rd = log(e, rs1) | ZtransExt |
FEXP10 | power-of-10 | rd = pow(10, rs1) | ZtransExt |
FLOG10 | log base 10 | rd = log10(rs1) | ZtransExt |
"""]]
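
FEXPM1 and FLOG1P compute the same mathematical value as the obvious two-step sequences, but the two-step forms lose precision for inputs near zero. The Python sketch below illustrates this with the standard library's `expm1` and `log1p` standing in for the proposed instructions; the helper names and the test input are assumptions chosen for illustration.

```python
import math

def fexpm1_two_step(x):
    """exp followed by a subtract: suffers cancellation for small x."""
    return math.exp(x) - 1.0

def flog1p_two_step(x):
    """add followed by log: 1.0 + x already rounds away low bits of x."""
    return math.log(1.0 + x)

def rel_err(got, want):
    """Relative error of got with respect to want."""
    return abs(got - want) / abs(want)

x = 1e-10
# Two-term Taylor series, accurate to far better than double precision here.
want_expm1 = x + x * x / 2.0
want_log1p = x - x * x / 2.0

err_fused_expm1 = rel_err(math.expm1(x), want_expm1)
err_naive_expm1 = rel_err(fexpm1_two_step(x), want_expm1)
err_fused_log1p = rel_err(math.log1p(x), want_log1p)
err_naive_log1p = rel_err(flog1p_two_step(x), want_log1p)
```

For this input the two-step errors are many orders of magnitude larger than the fused ones, which is the usual argument for providing both operations directly.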

# List of 1-arg trigonometric opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
FSINPI | sin of (pi * x) | rd = sin(pi * rs1) | Ztrigpi |
FCOSPI | cos of (pi * x) | rd = cos(pi * rs1) | Ztrigpi |
FTANPI | tan of (pi * x) | rd = tan(pi * rs1) | Ztrigpi |
FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Ztrigh |
FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Ztrigh |
FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Ztrigh |
FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Ztrigh |
FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Ztrigh |
FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Ztrigh |
"""]]
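
One reason the xxx-pi forms are attractive is that argument reduction can be done exactly on the input before any multiplication by pi. The Python sketch below models FSINPI that way; `fsinpi_model` is a hypothetical name and the reduction strategy is an assumption for illustration, not the proposal's required algorithm.

```python
import math

def fsinpi_model(x):
    """Model of rd = sin(pi * rs1) with exact reduction of rs1 first."""
    r = math.fmod(x, 2.0)          # exact remainder, r in (-2, 2)
    if r > 1.0:
        r -= 2.0                   # exact: maps (1, 2) onto (-1, 0)
    elif r < -1.0:
        r += 2.0                   # exact: maps (-2, -1) onto (0, 1)
    if abs(r) > 0.5:
        # Reflect about +/-0.5: sin(pi * r) == sin(pi * (sign(r) - r)).
        r = math.copysign(1.0, r) - r
    return math.sin(math.pi * r)   # |r| <= 0.5, so the product is small

# Multiplying first leaves a residue at integer inputs, because the
# double-precision value of pi is not exactly pi.
naive_at_one = math.sin(math.pi * 1.0)
model_at_one = fsinpi_model(1.0)
```

With the reduction done first, integer inputs give an exact zero, whereas `naive_at_one` is a small non-zero value.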

# Pseudo-code ops and macro-ops

These pseudo-ops are best left to the compiler rather than defined as
actual assembler pseudo-ops. The compiler can allocate one scalar FP
register as a loop-invariant constant, set to "1.0" at the beginning of
a function or other suitable code block.

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0), using FATAN2
* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0), using FATAN2PI
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).

FATANPI example pseudo-code:

    lui t0, 0x3F800          // t0 = 0x3F800000, the f32 bit pattern of 1.0
    fmv.w.x ft0, t0          // integer-to-FP bit move (formerly fmv.s.x)
    fatan2pi.s rd, rs1, ft0  // FATANPI alias: second operand is the constant 1.0
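
The constant in the example can be checked in a few lines of Python. This sketch reinterprets the value that `lui` would produce as an IEEE 754 single and then evaluates the alias as described in the list above; `f32_from_bits` and `fatanpi_alias` are hypothetical helper names.

```python
import math
import struct

def f32_from_bits(bits):
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# lui writes its 20-bit immediate into bits 31:12 of the destination.
t0 = 0x3F800 << 12
ft0 = f32_from_bits(t0)

def fatanpi_alias(x, one=ft0):
    """FATANPI as described by the alias: atan2pi(x, 1.0)."""
    return math.atan2(x, one) / math.pi
```

Here `ft0` evaluates to 1.0, and `fatanpi_alias(x)` agrees with `math.atan(x) / math.pi`.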