\documentclass[slidestop]{beamer}
\usepackage{beamerthemesplit}
\usepackage{graphics}
\usepackage{pstricks}

\title{Open 3D Alliance RISC-V}
\author{Luke Kenneth Casson Leighton}


\begin{document}

\frame{
\begin{center}
\huge{Open 3D Alliance: RISC-V}\\
\vspace{32pt}
\Large{An open invitation to collaborate on 3D Graphics}\\
\Large{Hardware and Software}\\
\Large{for mobile, embedded, and innovative purposes}\\
\vspace{24pt}
\Large{With thanks to Pixilica, GoWin, and Western Digital}\\
\vspace{16pt}
\large{\today}
\end{center}
}

\frame{\frametitle{Why collaborate?}

\begin{itemize}
\item 3D is hard. It's also not the same as HPC\vspace{15pt}
\item NVIDIA, AMD, Imagination - cannot meet "unusual" needs\vspace{15pt}
\item Working together on flexible standards, everyone wins\vspace{15pt}
\item Without collaboration: 10-20 man-years of development\vspace{10pt}
\item With collaboration: cross-verification (avoids mistakes)
\end{itemize}
}

\frame{\frametitle{What is the goal?}

\begin{itemize}
\item You get to decide! No, really!\vspace{12pt}
\item Outlined here: some ideas and cost/time-saving approaches\vspace{12pt}
\item Two new platforms: 3D "Embedded", 3D "UNIX"\vspace{12pt}
\item Flexible optional extensions (Transcendentals, Vectors,\\
Texturisation, Pixel/Z-Buffers - all optional)\vspace{12pt}
\item Good software support absolutely essential\\
(basically, that means Vulkan)\vspace{15pt}
\end{itemize}
}

\frame{\frametitle{Libre RISC-V Team}

\begin{itemize}
\item Small team, sponsored by Purism and the NLnet Foundation\vspace{8pt}
\item Therefore, focus is on efficiency: leap-frogging ahead\\
without requiring huge resources.\vspace{8pt}
\item OpenGL API? Gallium3D / Vulkan is better\vspace{8pt}
\item Gallium3D turns out to be a single-threaded interpreter\\
(Vulkan is compiled, and can be parallelised)\vspace{8pt}
\item Independent teams have provided OpenGL-to-Vulkan adaptors\vspace{8pt}
\item Same approach on hardware: seek highest bang-per-buck.\\
Save design time, save implementation time.\vspace{8pt}
\end{itemize}
}

\frame{\frametitle{What (optional) things are needed?}

\begin{itemize}
\item Vectorisation. (SIMD? RVV? Other?)\vspace{12pt}
\item Transcendentals (SIN, COS, EXP, LOG)\vspace{12pt}
\item Texture opcodes, Pixel/Z-Buffers\vspace{12pt}
\item Pixel conversion (YUV/RGB etc.)\vspace{12pt}
\item Optional accuracy (embedded space needs less accuracy)\vspace{12pt}
\item Options give implementors flexibility. No imposition:\\
imposition risks fragmentation (however, collaboration does\\
need some hard rules that are easy to justify logically)
\end{itemize}
}

\frame{\frametitle{What is essential (not really optional)?}

\begin{itemize}
\item The software, basically. Anything other than Vulkan\\
is a 10+ man-year effort.
\item Two new 3D "platforms". Vulkan compliance has implications\\
for hardware and, with the API being public, interoperability\\
(and Khronos Compliance - which is Trademarked) is critical.
\item Respecting that standards are hard to get right\\
(and that consequences of mistakes are severe:\\
no opportunity for corrections after a freeze)
\item Respecting that, for collaboration and interoperability,\\
some things go into a standard that you might not "need"
\item Mutually respectful, open and fully transparent collaboration.\\
No NDAs, no "closed forums". We need the help of experts\\
(such as Mitch Alsup) in this highly technical specialist area.
\end{itemize}
}

\frame{\frametitle{Why Two new Platforms?}

\begin{itemize}
\item Unique pragmatic consequences of "Hybrid" CPU/GPU
\item Embedded - no traps need be raised (interoperability is\\
impossible, software toolchain collaboration is incidental).
\item UNIX - illegal instruction traps mandatory: software\\
interoperability is mandatory and essential.
\item 3D Embedded - failure to allow implementors the freedom\\
to reduce FP accuracy automatically results in product failure\\
(too many gates, too much power, equals end-user rejection).
\item 3D UNIX - likewise. Also: using the "Vulkan" name without\\
complying with Khronos Specifications is a Trademark violation.
\item Solution: allow software to select FP accuracy level\\
\textbf{at runtime}. (UNIX Platform: IEEE754. 3D UNIX: Vulkan).
\item HW: slow for IEEE754, fast for 3D. Product now competitive!
\end{itemize}
}

\frame{\frametitle{What has our team done already?}

\begin{itemize}
\item Decided to go the "Hybrid" Route (a separate GPU requires a\\
full-blown RPC/IPC mechanism to transfer all 3D API calls\\
from userspace memory to GPU memory... and back).
\item Developed Simple-V (a "Parallelising" API)\\
(Simple-V is very hard to describe, because it is unique:\\
there is no common Computer Science terminology)
\item Started on Kazan (a Vulkan SPIR-V to LLVM compiler)
\item Started work on a highly flexible IEEE754 FPU
\item Started work on a "Precise" CDC 6600 style OoO Engine,\\
with help from Mitch Alsup, the architect of the Motorola 88000
\item Variable-issue, predicated SIMD backend, Vector front-end,\\
"precise" exceptions, branch shadowing, much more
\item All Libre-licensed and developed publicly and transparently.
\end{itemize}
}

\frame{\frametitle{Why Simple-V? Why not RVV?}

\begin{itemize}
\item RVV is designed exclusively for supercomputing\\
(RVV simply has not been designed with 3D in mind).\vspace{6pt}
\item Like SIMD, RVV uses dedicated opcodes\\
(search for "SIMD Instructions Considered Harmful")\vspace{6pt}
\item 98\% of FP opcodes are duplicated in RVV. A large portion\\
of BitManip opcodes are duplicated in predicate Masks\vspace{6pt}
\item OP32 space is extremely precious: 48- and 64-bit opcode space\\
comes with an inherent I-Cache power consumption penalty\vspace{6pt}
\item Simple-V "prefixes" scalar opcodes (all of them).\\
No need for any new "vector" opcodes (at all).\\
Can therefore use the RVV major opcode for 3D\vspace{6pt}
\end{itemize}
}

\frame{\frametitle{Simple-V "Prefixing"}

\begin{itemize}
\item SV "Prefix" does exactly that: takes RVC and OP32 opcodes\\
and "prefixes" them with predication and a "vector" tag\vspace{8pt}
\item Three prefix types: SV P32 (prefixed RVC), P48 and P64\vspace{8pt}
\item Prefixed RVC takes 3 "Custom" OP32 opcodes.\\
P48 takes standard OP32 scalar opcodes and "prefixes" them.\\
P64 adds additional vector context on top of P48.\vspace{8pt}
\item "Prefixing" is a bit like SIMD. Vectors may be specified\\
of length 2 to 4, elements may be "packed" into registers,\\
and opcode element widths may be overridden.\vspace{8pt}
\item Convenient, but not very space-efficient (whereas VBLOCK is)\vspace{8pt}
\end{itemize}
}

\frame{\frametitle{VBLOCK Format}

\begin{itemize}
\item Again: hard to describe. It is a bit like VLIW (only not really).\\
A "block" of instructions is "prefixed" with register "tags"\\
which give extra context to scalar instructions within the block
\item Sub-blocks include: Vector Length, Swizzling, Vector/Width\\
overrides, and predication. All this is added to scalar opcodes!\\
\textbf{There are NO vector opcodes} (and no need for any)
\item In the "context", it goes like this: if a register is used\\
by a scalar opcode, and that register is listed in the "context",\\
SV mode is "activated".
\item "Activation" results in a hardware-level "for-loop" issuing\\
\textbf{multiple} contiguous scalar operations (instead of just one).
\item Implementors are free to implement the "loop" in any fashion\\
they see fit. SIMD, Multi-issue, single-execution: anything.
\end{itemize}
}

\frame{\frametitle{Other Standard Proposals}

\begin{itemize}
\item Ztrans and Ztrig* - Transcendentals and Trigonometrics\\
(optional so that Embedded implementors have some leeway)
\item ISAMUX / ISANS - stops arguments over OP32 space\\
(also allows clean "paging" of new opcodes into e.g. RVC)
\item MV.SWIZZLE and MV.X - RV has no dedicated MV opcode\\
(scalar MV is only a pseudo-op for ADDI).
\item Zfacc - dynamic FP accuracy. Needed for "fast" Vulkan native\\
and to switch between fast 3D accuracy and IEEE754 modes.
\item These - and more - need your input! 3D is hard!
\item The key strategic premise: these are required as \textbf{public}\\
standards, because the \textbf{software} is to be public.
\item This is \textbf{not} understood by the RISC-V Foundation.\\
("custom" status not appropriate for high-profile mass-volume\\
end-user APIs such as Vulkan).
\end{itemize}
}

\frame{\frametitle{Summary}

\begin{itemize}
\item 3D is hard (and pure Vectorisation gets you 25\% of\\
commercially-acceptable performance).
\item Layered optional extensions are going to be key to\\
acceptance by a wide variety of 3D Alliance Members.
\item With a custom specialised SPIR-V (Vulkan) Compiler\\
being an absolutely critical strategic requirement,\\
RVV and its associated compiler (still not developed)\\
are of marginal value (no clear benefits, extra cost).
\item Question everything! Your input, and a willingness to\\
take active responsibility for tasks that your Company\\
is critically dependent on, are extremely important.
\item Public and transparent Collaboration is key. There is simply\\
too much to do.
\end{itemize}
}

\frame{
\begin{center}
{\Huge \vspace{20pt}
The end\vspace{20pt}\\
Thank you\vspace{20pt}\\
}
\end{center}

\begin{itemize}
\item http://lists.libre-riscv.org/pipermail/libre-riscv-dev/
\item http://libre-riscv.org/simple\_v\_extension/abridged\_spec/
\item https://libre-riscv.org/ztrans\_proposal/
\item https://libre-riscv.org/simple\_v\_extension/specification/mv.x/
\end{itemize}
}


\end{document}