# External RFC ls012: Discuss priorities of Libre-SOC Scalar(Vector) ops

* <https://git.openpower.foundation/isa/PowerISA/issues/121>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1051>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1052>

The purpose of this RFC is to give a full list of the upcoming Scalar
opcodes developed by Libre-SOC, to formally agree a priority order, to
decide which ones should go in the EXT022 Sandbox, and to give IBM a
clear picture of the Opcode Allocation needs. It is worth bearing in mind
that a given "Defined Word" may or may not be Vectoriseable, but that
every "Defined Word" should have merit on its own, not just when
Vectorised. An example of a borderline Vectoriseable Defined Word is
`mv.swizzle`, which only really becomes high-priority for Vector GPU and
HPC Workloads and has less merit as a Scalar-only operation.

Instruction count guide and approximate priority order:

* 6 - SVP64 Management [[ls008]] [[ls009]] [[ls010]]
* 5 - CR weirds
* 4 - INT<->FP mv [[ls006]]
* 19 - GPR LD/ST-PostIncrement-Update (saves hugely in hot-loops) [[ls011]]
* ~12 - FPR LD/ST-PostIncrement-Update (ditto) [[ls011]]
* 2 - Float-Load-Immediate (always saves one LD L1/2/3 D-Cache op) [[ls002]]
* 5 - Big-Integer Chained 3-in 2-out (64-bit Carry)
* 6 - Bitmanip LUT2/3 operations (high cost, high reward)
* 1 - fclass (Scalar variant of xvtstdcsp)
* 5 - Audio-Video
* 2 - Shift-and-Add (mitigates LD-ST-Shift; Cryptography e.g. twofish)
* 2 - BMI group
* 2 - GPU swizzle
* 9 - FP DCT/FFT Butterfly (2/3-in 2-out)
* ~9 - Integer DCT/FFT Butterfly
* 18 - Trigonometric (1-arg)
* 15 - Transcendentals (1-arg)
* 25 - Transcendentals (2-arg)

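As a rough aid to the Opcode Allocation discussion, the counts in the list above can be tallied. The following is a minimal sketch only: the names `COUNTS`, `tally` and `cumulative` are hypothetical and not part of the RFC, and because two entries are marked approximate ("~") the total should be read as a ballpark figure, not an agreed allocation.

```python
# Counts transcribed from the priority list above, in the order given.
# The boolean marks entries that the list gives as approximate ("~").
COUNTS = [
    ("SVP64 Management", 6, False),
    ("CR weirds", 5, False),
    ("INT<->FP mv", 4, False),
    ("GPR LD/ST-PostIncrement-Update", 19, False),
    ("FPR LD/ST-PostIncrement-Update", 12, True),
    ("Float-Load-Immediate", 2, False),
    ("Big-Integer Chained 3-in 2-out", 5, False),
    ("Bitmanip LUT2/3", 6, False),
    ("fclass", 1, False),
    ("Audio-Video", 5, False),
    ("Shift-and-Add", 2, False),
    ("BMI group", 2, False),
    ("GPU swizzle", 2, False),
    ("FP DCT/FFT Butterfly", 9, False),
    ("Integer DCT/FFT Butterfly", 9, True),
    ("Trigonometric (1-arg)", 18, False),
    ("Transcendentals (1-arg)", 15, False),
    ("Transcendentals (2-arg)", 25, False),
]


def tally(counts):
    """Return (total, approximate_part) over all groups."""
    total = sum(n for _, n, _ in counts)
    approx = sum(n for _, n, is_approx in counts if is_approx)
    return total, approx


def cumulative(counts):
    """Return (name, running_total) pairs in priority order."""
    running = 0
    out = []
    for name, n, _ in counts:
        running += n
        out.append((name, running))
    return out


total, approx = tally(COUNTS)
print(f"about {total} instructions, of which {approx} come from '~' entries")
for name, running in cumulative(COUNTS):
    print(f"{running:4d}  {name}")
```

The running total shows how many instructions would be needed if everything down to a given priority level were accepted, which may help when deciding where a cut-off between main allocation and EXT022 Sandbox could fall.
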
[[!inline pages="openpower/sv/rfc/ls012/areas.mdwn" raw=yes ]]