ztrans_proposal.mdwn

# Zftrans - transcendental operations

See:

* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* [[rv_major_opcode_1010011]] for opcode listing.

Extension subsets:

* **Zftrans**: standard transcendentals (best suited to 3D)
* **ZftransExt**: extra functions (useful, not generally needed for 3D, can be synthesised using Zftrans)
* **Ztrigpi**: trigonometric functions with pi-scaled arguments: sinpi, cospi, tanpi
* **Ztrignpi**: trigonometric functions with arguments in radians: sin, cos, tan
* **Zarctrigpi**: inverse trigonometric functions with results divided by pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: inverse trigonometric functions with results in radians: atan2, asin, acos
* **Zfhyp**: hyperbolic and inverse hyperbolic functions: sinh, cosh, tanh, asinh, acosh, atanh
* **ZftransAdv**: functions that are much more complex to implement in hardware

Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
Zarctrignpi.
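As an illustration of how the ZftransExt functions can be synthesised from the Zftrans base-2 operations, here is a minimal Python sketch. The function names (`fexp2`, `flog2`, `fexp`, and so on) are hypothetical software stand-ins for the opcodes, not part of the proposal, and the identities lose some accuracy because the scaled argument is rounded before the base-2 operation is applied.

```python
import math

# Constants a compiler could keep in FP registers (assumed, for illustration).
LOG2_E = math.log2(math.e)    # log2(e)
LOG2_10 = math.log2(10.0)     # log2(10)

def fexp2(x):
    """Stand-in for FEXP2: 2 to the power x."""
    return 2.0 ** x

def flog2(x):
    """Stand-in for FLOG2: base-2 logarithm."""
    return math.log2(x)

# ZftransExt operations expressed through the two Zftrans primitives.
def fexp(x):
    return fexp2(x * LOG2_E)      # e^x = 2^(x * log2(e))

def flog(x):
    return flog2(x) / LOG2_E      # ln(x) = log2(x) / log2(e)

def fexp10(x):
    return fexp2(x * LOG2_10)     # 10^x = 2^(x * log2(10))

def flog10(x):
    return flog2(x) / LOG2_10     # log10(x) = log2(x) / log2(10)
```

Each synthesised function costs one extra multiply or divide around the base-2 primitive, which is the trade-off the ZftransExt subset avoids.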

[[!toc levels=2]]

# List of 2-arg opcodes

[[!table  data="""
opcode    | Description                        | pseudo-code                | Extension   |
FATAN2    | atan2 arc tangent                  | rd = atan2(rs2, rs1)       | Zarctrignpi |
FATAN2PI  | atan2 arc tangent / pi             | rd = atan2(rs2, rs1) / pi  | Zarctrigpi  |
FPOW      | x to the power of y                | rd = pow(rs1, rs2)         | ZftransAdv  |
FROOT     | x to the power of 1/y (y-th root)  | rd = pow(rs1, 1/rs2)       | ZftransAdv  |
FHYPOT    | hypotenuse                         | rd = sqrt(rs1^2 + rs2^2)   | Zftrans     |
"""]]
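The pseudo-code column can be mirrored in software as a reference model for testing. The following Python sketch does that for three of the 2-arg opcodes; the function names are hypothetical, and the operand order simply follows the table as written. It also shows one reason a dedicated hypotenuse operation is attractive: squaring the operands separately can overflow even when the result is representable.

```python
import math

def fatan2pi(rs1, rs2):
    """Table pseudo-code: rd = atan2(rs2, rs1) / pi."""
    return math.atan2(rs2, rs1) / math.pi

def froot(rs1, rs2):
    """Table pseudo-code: rd = pow(rs1, 1/rs2)."""
    return math.pow(rs1, 1.0 / rs2)

def fhypot(rs1, rs2):
    """rd = sqrt(rs1^2 + rs2^2), computed without intermediate overflow."""
    return math.hypot(rs1, rs2)

def naive_hypot(rs1, rs2):
    """Same formula built from separate multiplies: the squares may overflow."""
    return math.sqrt(rs1 * rs1 + rs2 * rs2)
```

For example, with both operands set to `1e200` the naive version overflows to infinity while `fhypot` returns a finite value.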

# List of 1-arg transcendental opcodes

[[!table  data="""
opcode   | Description              | pseudo-code             | Extension  |
FCBRT    | cube root                | rd = pow(rs1, 1.0 / 3)  | Zftrans    |
FEXP2    | power-of-2               | rd = pow(2, rs1)        | Zftrans    |
FLOG2    | log2                     | rd = log2(rs1)          | Zftrans    |
FEXPM1   | exponential minus 1      | rd = pow(e, rs1) - 1.0  | Zftrans    |
FLOG1P   | natural log of 1 plus x  | rd = log(e, 1 + rs1)    | Zftrans    |
FEXP     | exponential              | rd = pow(e, rs1)        | ZftransExt |
FLOG     | natural log (base e)     | rd = log(e, rs1)        | ZftransExt |
FEXP10   | power-of-10              | rd = pow(10, rs1)       | ZftransExt |
FLOG10   | log base 10              | rd = log10(rs1)         | ZftransExt |
"""]]
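FEXPM1 and FLOG1P are listed separately from FEXP and FLOG because evaluating the pseudo-code literally loses precision when rs1 is close to zero. The Python sketch below (helper names are illustrative only) compares the literal formulas with the standard library's fused `math.expm1` and `math.log1p`, using the first two Taylor terms as a reference for a tiny input.

```python
import math

def rel_err(value, reference):
    """Relative error of value against reference."""
    return abs(value - reference) / abs(reference)

x = 1e-10

# Reference values from the Taylor series; higher-order terms are negligible here.
expm1_ref = x + x * x / 2.0
log1p_ref = x - x * x / 2.0

# Literal pseudo-code: the intermediate result near 1.0 discards most of x.
naive_expm1 = math.exp(x) - 1.0
naive_log1p = math.log(1.0 + x)

# Fused operations keep full precision.
fused_expm1 = math.expm1(x)
fused_log1p = math.log1p(x)

print("expm1 naive/fused error:", rel_err(naive_expm1, expm1_ref), rel_err(fused_expm1, expm1_ref))
print("log1p naive/fused error:", rel_err(naive_log1p, log1p_ref), rel_err(fused_log1p, log1p_ref))
```

The literal forms lose roughly half of the significant digits for this input, while the fused forms stay accurate to about the last bit.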

# List of 1-arg trigonometric opcodes

[[!table  data="""
opcode   | Description              | pseudo-code             | Extension   |
FSIN     | sin (radians)            | rd = sin(rs1)           | Ztrignpi    |
FCOS     | cos (radians)            | rd = cos(rs1)           | Ztrignpi    |
FTAN     | tan (radians)            | rd = tan(rs1)           | Ztrignpi    |
FASIN    | arcsin (radians)         | rd = asin(rs1)          | Zarctrignpi |
FACOS    | arccos (radians)         | rd = acos(rs1)          | Zarctrignpi |
FSINPI   | sin of pi * x            | rd = sin(pi * rs1)      | Ztrigpi     |
FCOSPI   | cos of pi * x            | rd = cos(pi * rs1)      | Ztrigpi     |
FTANPI   | tan of pi * x            | rd = tan(pi * rs1)      | Ztrigpi     |
FASINPI  | arcsin / pi              | rd = asin(rs1) / pi     | Zarctrigpi  |
FACOSPI  | arccos / pi              | rd = acos(rs1) / pi     | Zarctrigpi  |
FATANPI  | arctan / pi              | rd = atan(rs1) / pi     | Zarctrigpi  |
FSINH    | hyperbolic sin (radians) | rd = sinh(rs1)          | Zfhyp       |
FCOSH    | hyperbolic cos (radians) | rd = cosh(rs1)          | Zfhyp       |
FTANH    | hyperbolic tan (radians) | rd = tanh(rs1)          | Zfhyp       |
FASINH   | inverse hyperbolic sin   | rd = asinh(rs1)         | Zfhyp       |
FACOSH   | inverse hyperbolic cos   | rd = acosh(rs1)         | Zfhyp       |
FATANH   | inverse hyperbolic tan   | rd = atanh(rs1)         | Zfhyp       |
"""]]
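The pi-scaled variants follow the OpenCL convention: sinpi(x) is sin(pi * x) and asinpi(x) is asin(x) / pi. A practical benefit is that the argument can be range-reduced exactly before any multiplication by pi. The Python sketch below is a software illustration of that idea only; the function names are hypothetical and it says nothing about how hardware would implement the opcodes.

```python
import math

def sinpi(x):
    """sin(pi * x) with exact range reduction of x (illustrative only)."""
    r = math.fmod(x, 2.0)          # exact: r has the sign of x, |r| < 2
    if r > 1.0:
        r -= 2.0                   # bring r into [-1, 1]
    elif r < -1.0:
        r += 2.0
    if r > 0.5:
        r = 1.0 - r                # sin(pi * (1 - r)) == sin(pi * r)
    elif r < -0.5:
        r = -1.0 - r               # sin(pi * (-1 - r)) == sin(pi * r)
    return math.sin(math.pi * r)   # |r| <= 0.5, so pi * r is small and accurate

def asinpi(x):
    """asin(x) / pi, so the result lies in [-0.5, 0.5]."""
    return math.asin(x) / math.pi

# Multiplying by a rounded pi first leaves a residue at integer arguments.
naive = math.sin(math.pi * 1.0)
print(naive, sinpi(1.0))
```

With the reduction done first, integer arguments give exactly zero, whereas the naive product with a rounded pi does not.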

# Pseudo-code ops and macro-ops

These operations are best generated by the compiler rather than defined as
actual assembler pseudo-ops: the compiler allocates one scalar FP register
as a (loop-invariant) constant, set to "1.0" at the beginning of a function
or other suitable code block.

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0), implemented using FATAN2
* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0), implemented using FATAN2PI
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).

FATANPI example pseudo-code:

    lui t0, 0x3F800 // upper bits of f32 1.0
    fmv.s.x ft0, t0 // move the bit pattern from the integer register into ft0
    fatan2pi.s rd, rs1, ft0
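The constant used in the example can be checked in software. The Python sketch below (helper names are hypothetical) models `lui` as placing a 20-bit immediate in the upper bits of a 32-bit register, reinterprets the result as an IEEE 754 single-precision value, and then evaluates the FATANPI alias as atan2pi(x, 1.0).

```python
import math
import struct

def lui(imm20):
    """Model of RISC-V LUI on a 32-bit register: immediate goes in bits 31..12."""
    return (imm20 << 12) & 0xFFFFFFFF

def bits_to_f32(bits):
    """Reinterpret a 32-bit integer pattern as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

one = bits_to_f32(lui(0x3F800))   # 0x3F800000 is the f32 encoding of 1.0

def fatanpi(x):
    """FATANPI alias: atan2pi(x, 1.0), i.e. atan(x) / pi."""
    return math.atan2(x, one) / math.pi

print(hex(lui(0x3F800)), one, fatanpi(1.0))
```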