# Ztrans - transcendental operations

See:

* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>

Extension subsets:

* **Ztrans**: standard transcendentals (best suited to 3D)
* **ZtransExt**: extra functions (useful, but not generally needed for 3D)
* **Ztrigpi**: trigonometric functions with arguments scaled by pi: sinpi, cospi, tanpi
* **Ztrignpi**: trigonometric functions in plain radians: sin, cos, tan
* **Zarctrigpi**: inverse trigonometric functions with results scaled by 1/pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: inverse trigonometric functions in plain radians: atan2, asin, acos
* **Ztrigh**: hyperbolic and inverse hyperbolic functions: sinh, cosh, tanh, asinh, acosh, atanh
* **ZtransAdv**: functions that are much more complex to implement in hardware

Minimum recommended requirements for 3D: Ztrans, Ztrigpi, Zarctrigpi,
Zarctrignpi

[[!toc levels=2]]

# List of 2-arg opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
FPOW | x to the power of y | rd = pow(rs1, rs2) | ZtransAdv |
FROOT | x to the power of 1/y | rd = pow(rs1, 1/rs2) | ZtransAdv |
"""]]
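
As a rough cross-check of the pseudo-code column, the four 2-arg opcodes can be modelled in Python on top of the standard `math` module. This is only a sketch: the function names and the `(rs1, rs2)` argument order are assumptions made for illustration, results are double precision rather than the width of a real FP register, and special cases (NaN, infinities, division by zero) are not modelled.

```python
import math


def fatan2(rs1, rs2):
    """FATAN2: rd = atan2(rs2, rs1), following the table's operand order."""
    return math.atan2(rs2, rs1)


def fatan2pi(rs1, rs2):
    """FATAN2PI: rd = atan2(rs2, rs1) / pi, so the result lies in [-1, 1]."""
    return math.atan2(rs2, rs1) / math.pi


def fpow(rs1, rs2):
    """FPOW: rd = pow(rs1, rs2)."""
    return math.pow(rs1, rs2)


def froot(rs1, rs2):
    """FROOT: rd = pow(rs1, 1 / rs2). Python raises on rs2 == 0;
    hardware behaviour for that case is not modelled here."""
    return math.pow(rs1, 1.0 / rs2)
```

With this operand order, `fatan2pi(-1.0, 0.0)` describes a point on the negative x axis and returns 1.0, that is, half a turn.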

# List of 1-arg transcendental opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | Ztrans |
FEXP2 | power-of-2 | rd = pow(2, rs1) | Ztrans |
FLOG2 | log2 | rd = log2(rs1) | Ztrans |
FEXPM1 | exponential minus 1 | rd = pow(e, rs1) - 1.0 | Ztrans |
FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | Ztrans |
FEXP | exponential | rd = pow(e, rs1) | ZtransExt |
FLOG | natural log (base e) | rd = log(e, rs1) | ZtransExt |
FEXP10 | power-of-10 | rd = pow(10, rs1) | ZtransExt |
FLOG10 | log base 10 | rd = log10(rs1) | ZtransExt |
"""]]
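
FEXPM1 and FLOG1P can look redundant next to FEXP and FLOG. The usual reason C and OpenCL provide `expm1` and `log1p` (offered here as background, not as a rationale stated on this page) is that they keep full precision for arguments close to zero, where `exp(x) - 1.0` and `log(1 + x)` lose most of their significant digits. The sketch below models a few of the Ztrans opcodes with Python's `math` module and measures that loss. The function names, the `rel_err` helper and the handling of negative inputs in `fcbrt` are assumptions made for illustration.

```python
import math


def fcbrt(rs1):
    """FCBRT: cube root. Extended to negative inputs by copying the sign,
    which is an assumption; pow(x, 1/3) alone is undefined for x < 0."""
    return math.copysign(abs(rs1) ** (1.0 / 3.0), rs1)


def fexp2(rs1):
    """FEXP2: rd = pow(2, rs1)."""
    return math.pow(2.0, rs1)


def flog2(rs1):
    """FLOG2: rd = log2(rs1)."""
    return math.log2(rs1)


def fexpm1(rs1):
    """FEXPM1: rd = pow(e, rs1) - 1.0, computed without cancellation."""
    return math.expm1(rs1)


def flog1p(rs1):
    """FLOG1P: rd = log(e, 1 + rs1), computed without cancellation."""
    return math.log1p(rs1)


def naive_expm1(rs1):
    """The two-instruction alternative: FEXP followed by a subtract."""
    return math.exp(rs1) - 1.0


def naive_log1p(rs1):
    """The two-instruction alternative: an add followed by FLOG."""
    return math.log(1.0 + rs1)


def rel_err(got, want):
    """Relative error of got against a reference value want."""
    return abs(got - want) / abs(want)


# For tiny x both functions equal x to first order (the next term is x*x/2),
# so x itself serves as the reference value.
x = 1e-10
print("fused expm1 error:", rel_err(fexpm1(x), x))
print("naive expm1 error:", rel_err(naive_expm1(x), x))
print("fused log1p error:", rel_err(flog1p(x), x))
print("naive log1p error:", rel_err(naive_log1p(x), x))
```

The naive forms round `1 + x` or `exp(x)` to the nearest double near 1.0 before the small part is extracted, which is where the digits are lost.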

# List of 1-arg trigonometric opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Ztrigh |
FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Ztrigh |
FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Ztrigh |
FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Ztrigh |
FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Ztrigh |
FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Ztrigh |
"""]]
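
The xxx-pi forms follow the OpenCL definitions linked at the top of this page: `sinpi(x)` is `sin(pi * x)`, while `asinpi(x)` is `asin(x) / pi`. A practical property of the forward functions (general background, not a claim made on this page) is that the argument can be reduced modulo 2 exactly, so whole and half-integer inputs give exact results. The Python sketch below illustrates this. The function names are assumptions, `ftanpi` is left out because its poles need a policy this page does not specify, and infinities and NaN are not modelled.

```python
import math


def fsinpi(rs1):
    """FSINPI: rd = sin(pi * rs1), with rs1 reduced modulo 2 first."""
    r = math.fmod(rs1, 2.0)      # fmod is exact; r lies in (-2, 2)
    if r == math.floor(r):       # whole number: sin(pi * n) is exactly 0
        return 0.0
    return math.sin(math.pi * r)


def fcospi(rs1):
    """FCOSPI: rd = cos(pi * rs1), with rs1 reduced modulo 2 first."""
    r = math.fmod(rs1, 2.0)
    if abs(math.fmod(r, 1.0)) == 0.5:   # odd multiple of 1/2: exactly 0
        return 0.0
    return math.cos(math.pi * r)


def fasinpi(rs1):
    """FASINPI: rd = asin(rs1) / pi."""
    return math.asin(rs1) / math.pi


def facospi(rs1):
    """FACOSPI: rd = acos(rs1) / pi."""
    return math.acos(rs1) / math.pi


def fatanpi(rs1):
    """FATANPI: rd = atan(rs1) / pi."""
    return math.atan(rs1) / math.pi


# Multiplying by a rounded pi first leaves a small residue; reducing first does not.
print("sin(pi * 1e6):", math.sin(math.pi * 1e6))
print("fsinpi(1e6):  ", fsinpi(1e6))
```

The same reduction is not available to FSIN and FCOS, whose period 2*pi is not representable in floating point.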

# Pseudo-code ops and macro-ops

The pseudo-ops are best left up to the compiler rather than being actual
pseudo-ops: the compiler allocates one scalar FP register as a
loop-invariant constant, set to "1.0" at the beginning of a function or
other suitable code block, and uses it as an operand of the real opcode.

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0), using FATAN2
* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0), using FATAN2PI
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
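
The aliases above can be sketched in Python to show how a single register held at 1.0 serves all of them. Everything named here is hypothetical: `FT_ONE` stands in for the reserved FP register, and the helpers take explicit `(y, x)` arguments rather than committing to an instruction operand order.

```python
import math

FT_ONE = 1.0  # stands in for the FP register the compiler keeps at 1.0


def fdiv(rs1, rs2):
    """Ordinary FP divide: rd = rs1 / rs2."""
    return rs1 / rs2


def fatan2(y, x):
    """Two-argument arc tangent, written with explicit (y, x) names."""
    return math.atan2(y, x)


def fatan2pi(y, x):
    """Two-argument arc tangent divided by pi."""
    return math.atan2(y, x) / math.pi


def frcp(rs1):
    """FRCP alias: rd = 1.0 / rs1."""
    return fdiv(FT_ONE, rs1)


def fatan(rs1):
    """FATAN alias: rd = atan2(rs1, 1.0)."""
    return fatan2(rs1, FT_ONE)


def fatanpi(rs1):
    """FATANPI alias: rd = atan2pi(rs1, 1.0)."""
    return fatan2pi(rs1, FT_ONE)


def fsincos(rs1):
    """FSINCOS macro-op: FSIN then FCOS on the same input, as a pair."""
    return math.sin(rs1), math.cos(rs1)
```

Because the constant is loop-invariant, its setup cost is paid once per function or block rather than once per alias.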

FATANPI example pseudo-code:

    lui t0, 0x3F800         # upper 20 bits of f32 1.0 (0x3F800000)
    fmv.s.x ft0, t0         # move the bit pattern from integer t0 into FP ft0
    fatan2pi.s rd, rs1, ft0 # rd = atan2pi(rs1, 1.0)
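
The constant used in the example can be checked with a few lines of Python: loading the 20-bit immediate 0x3F800 into the upper bits of a 32-bit register and reinterpreting the result as IEEE 754 single precision should give 1.0. The helper names below are hypothetical and only model the bit manipulation, not the instructions themselves.

```python
import struct


def lui(imm20):
    """Model of LUI on a 32-bit register: immediate in bits 31..12, zeros below."""
    return (imm20 & 0xFFFFF) << 12


def int_bits_to_f32(bits):
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_to_int_bits(value):
    """Inverse of int_bits_to_f32: the raw 32-bit pattern of a float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


t0 = lui(0x3F800)
ft0 = int_bits_to_f32(t0)
print(hex(t0), ft0)
```

A single `lui` is enough here because the low 12 bits of the pattern for 1.0 are all zero; constants with non-zero low bits would need further instructions.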