\documentclass[slidestop]{beamer}
\usepackage{beamerthemesplit}
\usepackage{graphics}
\usepackage{pstricks}

\title{Commercial Libre-RISCV SoC}
\author{Luke Kenneth Casson Leighton}


\begin{document}

\frame{
\begin{center}
\huge{Designing a Commercial Libre RISC-V SoC}\\
\vspace{32pt}
\Large{Ethical Strategic Leveraging of the benefits}\\
\Large{of Libre and Open SW/HW}\\
\Large{for pure unadulterated Commercial gain}\\
\vspace{24pt}
\Large{Chennai 9th RISC-V Workshop}\\
\vspace{16pt}
\large{\today}
\end{center}
}


\frame{\frametitle{Credits and Acknowledgements}

\begin{itemize}
\item The Designers of RISC-V\vspace{8pt}
\item The RISC-V Foundation\vspace{8pt}
\item The Shakti Group, and IIT Madras RISE Group\vspace{8pt}
\item Prof. G S Madhusudan\vspace{8pt}
\item Neel Gala\vspace{8pt}
\item Rishabh Jain\vspace{8pt}
\item Members of the RISC-V Open Groups (SW/HW/ISA)\vspace{8pt}
\item Libre and Open Software and Hardware Communities
\end{itemize}
}


\frame{\frametitle{Why, How, What?}

\begin{itemize}
\item Why? Because these days it's just not necessary to
make [un]ethical compromises in order to make a profitable,
desirable mass-volume product\\
{\it (There are enough companies doing that: where has it got us?)}
\item How? By leveraging the long-established strategic cost and
maintenance benefits of libre-licensed software (and
HDL) and
{\it making sure that the people who provide it are
financially rewarded}. Also by empowering diverse team
collaboration
\item What? A 2.5GHz RISC-V 64-bit SoC that has
a 3D Embedded GPU, 1080p Video decode, and interfaces
to make it attractive for use in tablets, netbooks, industrial
embedded and more. 22nm or less, under 400 pins, under USD \$4.\\
{\it All sounds obvious... but is it practical and achievable?}
\end{itemize}
}


\frame{\frametitle{Definitions}

\begin{itemize}
\item {\bf Business}: the provision of a service and being
commensurately financially rewarded for doing so
\item {\bf Spongeing}: the provision of a service and being
taken advantage of for doing so {\it (cf: Professor Yunus)}
\item {\bf An ethical act}: an act that increases truth,
love, awareness or creativity for one or more people
(including yourself), {\it without} reducing those
same four qualities {\it for anyone}
\item {\bf The Four Freedoms}: the rights and guarantees
associated with and embedded within GNU Licenses {\it (cf: FSF)}
\end{itemize}
{\it Is it possible to ethically do business and respect the
Four Freedoms? That's where it gets interesting, as there are
even cases where the Four Freedoms are unethical. Note: Google's
former motto ``don't be evil'' is clearly (unintentionally) unethical}
}


\frame{\frametitle{Does what we want already exist? Surely this is nonsense!}
\begin{center}
\includegraphics[height=2.4in]{nolibresocs.jpg}\\
{\bf Analysis of SoCs over the past 7+ years (answer: no)}
\end{center}
}


\frame{\frametitle{Breakdown of non-existence of fully-Libre SoCs}

\begin{itemize}
\item {\bf iMX6}: Libre bootable, Vivante 3D GPU (libre etnaviv)
but proprietary VPU (and a power-hungry Cortex A9)
\item {\bf Allwinner SoCs}: mostly Libre bootable,
VPU reverse engineered; GPU: MALI or PowerVR (i.e. proprietary)
\item {\bf Rockchip SoCs}: good, but using MALI or PowerVR.
\item {\bf TI OMAP}: good, but using PowerVR, and expensive.
\item {\bf Samsung}: good, but using MALI.
\item {\bf Ingenic jz4775}: GREAT! But performance
sucks (1GHz MIPS32).
\item {\bf Broadcom SoCs}: Cartelled, and boots from the GPU.
\end{itemize}
{\it Basically there does not exist one single commercial SoC that
provides full source code for all functions (CPU, GPU, VPU)
with modern performance. Which is kinda bizarre if you think about it}
}


\frame{\frametitle{What would a good (Libre) boring, mundane SoC have?}

\begin{itemize}
\item Cover a lot of different scenarios (embedded, tablets, industrial,
netbooks, crypto-currency mining).
\item Decent performance with high efficiency. RISC-V: 40\%
more efficient than ARM / Intel. Shakti a good
candidate: 2.5GHz and 120mW per core @ 22nm.
\item 1080p video: y'all gotta watch cute kittens on youtube, right?
\item 3D GPU: y'all gotta play Angri Burds, right? (or Minecraft)
\item No spying back-door co-processors (to steal crypto-wallets)
\item No Spectres, no Meltdowns.
\end{itemize}
{\it Basically quite boring and mundane. No Monster Performance,
no AI stuff, no special sauce. Just a plain-old SoC,
40\% more power efficient than ARM/Intel,
and not spying on end-users, that's all}
}


\frame{\frametitle{How on earth does an ethical Libre SoC make money???}

\begin{itemize}
\item Simple answer: Mask Rights.
\item Without Mask Rights: by having a desirable
product, and packaging it for a customer (i.e. by being a middle-man,
a service is still being provided for which payment etc. etc.)
\item Without a desirable product or customer(s): err... you don't.\\
(cf: definition of Business)
\item By not having high NREs (leveraging back-to-back deals,
and helping others fulfil their needs and goals)
\end{itemize}
{\it Detachment from the goal also helps. If someone else makes this
product then GREAT! I can go do something else}\\
\vspace{4pt}
{\bf Main point: please do not automatically assume Ethical and Libre is
non-commercial. It's not nice, and it's not helping}

}

\frame{\frametitle{Things wot are ``off-limits''}

\begin{itemize}
\item Customer entrapment (through proprietary software).\\
Strong business case for not entrapping customers:\\
https://tinyurl.com/most-productive-meeting-ever
\item Funding, endorsing, supporting or empowering unethical
Companies, Organisations, Cartels and Individuals.\\
(cf: definition of an ethical act).
\item Being totally inflexible / unrealistic. Goals have
to be met: it's no good being an idiot about that. e.g. if
a Libre 3D GPU really can't be made, use Vivante GC800
(with etnaviv).
\end{itemize}
{\it Still no real show-stoppers to making money (or product):
it's just slightly harder, that's all. Ultimately it's about
confidence.}
}


\frame{\frametitle{Interfaces, Block Diagram, of the Libre-RISCV SoC}
\begin{center}
\includegraphics[height=2.1in]{../shakti_libre_riscv.jpg}\\
{\bf Separate Power Domains for GPIO banks, Variable voltages
required, low-power sleep states etc. Quite involved}
\end{center}
}


\frame{\frametitle{Hardware / Development Complexity Comparison}

\begin{itemize}
\item {\bf Server}: relatively easy. PCIe, RapidIO, XAUI, SATA, GbE, 10GE,
DDR3/4 (or HMC) etc. etc. No multiplexing: all interfaces dedicated
and high-speed differential pairs.
\item {\bf Desktop}: really just a variant of Server.
Graphics is a PCIe Card (except if integrated). Peripherals
often done in dedicated external ICs (``Southbridge'' concept)
\item {\bf Embedded}: also pretty easy. Really needs a pinmux. Low clock
rate, low power mode. e.g. SiFive Freedom E310.
\item {\bf Mobile}: HARD. Performance/Watt matters $\Rightarrow$ variable core
voltage domains {\it per core}. Number of pins matters (affects
yield and package cost). Cost
matters. Pinmux critical.
\end{itemize}
{\it Bottom line: Mobile-class processors are challenging!}
}


\frame{\frametitle{Proprietary vs Libre-licensed Interface HDL}

\begin{itemize}
\item DDR3/4: challenging! \$1m for single-use, single instance.\\
Symbiotic EDA: \$600k for PHY; CERN developed a Controller\\
http://libre-riscv.org/shakti/m\_class/DDR/
\item HyperRAM (JEDEC xSPI): lower risk than DDR3/4\\
http://libre-riscv.org/shakti/m\_class/HyperRAM/
\item RGMII: several available (saves \$50k)\\
http://libre-riscv.org/shakti/m\_class/RGMII/
\item UART, SPI, I2C, PWM, SD/MMC: all libre (except eMMC).
\item Shakti Group has FlexBus, QuadSPI, SRAM, many more.
\item RGB/TTL: R. Herveille (SSD2828, SN75LVDS83b, TFP410a)
\end{itemize}
{\it Basically there's no compelling reason to spend vast sums
on proprietary HDL. Sorry Cadence / Mentor / Synopsys / whoever}
}


\frame{\frametitle{Challenging Stuff [1] - Memory Interfaces}

\begin{itemize}
\item DDR3/4 PHYs are analog and very high speed.
Impedance training. Extreme timing tolerances on parallel buses.\\
No surprise proprietary cost is USD \$1m and above.
\item Symbiotic EDA will do (Libre) PHY layout for USD \$300k,
time to completion for chosen geometry: 8-12 months.
\end{itemize}
{\it Silicon-proven but still risky. What are the alternatives?}
\vspace{4pt}
\begin{itemize}
\item 133MHz 32-bit SDRAM (um...) maybe even FlexBus?
\item HyperRAM (aka JEDEC xSPI) 8-bit SPI 166MHz or DDR-300.\\
300MByte/sec for only 13 wires, not bad! (We'll take several)\\
http://libre-riscv.org/shakti/m\_class/HyperRAM/
\item HMC: insanely fast, very low power. OpenHMC (LGPL)\\
https://opencores.org/project/openhmc
\end{itemize}
}
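

\frame{\frametitle{HyperRAM bandwidth: rough arithmetic (sketch)}

{\it A back-of-envelope check of the HyperRAM figure on the previous slide.
Assumptions (not datasheet values): ``DDR-300'' is read as 300 million
transfers per second on the 8-bit data bus, each of $N$ channels has its own
13 wires, and channels scale linearly. $N = 4$ is purely illustrative.}
\[
B_1 = 300 \times 10^{6}\ \mbox{transfers/s} \times 1\ \mbox{byte/transfer}
    = 300\ \mbox{MByte/s}
\]
\[
B_N = N \times B_1, \qquad
N = 4 \ \Rightarrow\ B_4 = 1200\ \mbox{MByte/s over } 4 \times 13 = 52\ \mbox{wires}
\]
{\it Peak figures only: command/address phases and refresh overhead
are ignored, so sustained throughput will be lower.}
}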


\frame{\frametitle{Challenging Stuff [2] - Video Decode Engine}

\begin{itemize}
\item Richard Herveille's Video Core Blocks\\
https://opencores.org/project/video\_systems
\item Symbiotic EDA MP4 decoder in FPGA
\item H.264 seems to have been done...\\
https://github.com/adsc-hls/synthesizable\_h264
\item Really needs SIMD (or better, not-SIMD)\\
{http://libre-riscv.org/simple\_v\_extension/}
\item Definitely needs xBitManip (parallelised by Simple-V)\\
https://github.com/cliffordwolf/xbitmanip
\end{itemize}
{\it SIMD is insane. $O(N^6)$ opcode proliferation. See\\
https://www.sigarch.org/simd-instructions-considered-harmful/ \\
(1): P-Ext designed for Audio. (2): Investigate RI5CY's SIMD
}
}


\frame{\frametitle{Challenging Stuff [3] - Power Management}

\begin{itemize}
\item Been done before (many times), but not as a Libre Design.
\item Sanjay Charagulla: GlobalFoundries 22nm mobile process
can reach as low as 0.4V
\item GPIO Banks need per-bank VREF (1.8V? to 3.3V)\\
IO pads need built-in
level-shifting to convert to CPU VCORE
\item Each core needs independent variable-voltage capability
and independent shut-down (PMIC supplies external voltage)
\item DDR RAM still needs refreshing (even in sleep mode)
\item Extra RV32 (PicoRV32?) always-on core for wake-up / RTC?
\item PLLs are Analog. fun fun fun in the sun sun sun...
\end{itemize}
{\it Really need help. PLLs, Analog stuff: specific
domain expertise. Fall-back example:
https://www.dolphin-integration.com?
}
}


\frame{\frametitle{Challenging Stuff [4] - Libre 3D GPU. Sigh.}

\begin{itemize}
\item Actual requirements quite modest: 30MP/s 100MT/s 5GFLOPS
but power/area is crucial (2mm$^2$ @ 40nm)
\item Nyuzi, MIAOW, GPLGPU (Number Nine), OGP.
\item Nyuzi based on Larrabee. Jeff Bush really helpful.
\item MIAOW is an OpenCL engine. GPLGPU is fixed-function.
\item Nyuzi lessons: Software-only rendering not enough.
Getting through L1 cache takes most power. Fixed functions
such as parallel FP-Quad to ARGB Pixel, and Z-Buffer
needed.
\item Fallback is GC800 (\$250k) {\it contact me if you can do better!}
\end{itemize}
{\it Jacob Bachmeyer's Cache-control proposal turns L1 Cache into
scratchpad RAM. RVV is just too heavy (sorry!), Simple-V much
more light-weight and flexible ($O(1)$ ISA proliferation)
}
}


\frame{\frametitle{Challenging Stuff [5] - Custom Extensions}

\begin{itemize}
\item GPUs are usually done with incompatible ISAs, effectively
doing OpenGL over IPC / RPC (Remote Procedure Calls)
\item Much simpler: GPGPU approach. Custom-extend the
main core ISA to handle 3D, and accelerate
Gallium3D-LLVM.
\item Now add Video Extensions. And SIMD. And, and, and...\\
{\bf we are well beyond the two 32-bit custom opcodes}
\item Due to the Libre nature of this project, the custom opcode
space will be ``dominated'' by
high-profile public hard-forks of gcc, binutils, llvm etc.
Which isn't going to go down well.
\item Instruction-set ``Conflict Resolution'' is therefore critical\\
http://libre-riscv.org/isa\_conflict\_resolution/
\end{itemize}
{\it Remember Altivec. Learn from Intel.
\underline{This is everyone's problem.}
}
}


\frame{\frametitle{TODO}

\begin{itemize}
\item TODO\vspace{8pt}
\end{itemize}
}


\frame{\frametitle{Summary}

\begin{itemize}
\item Making a commercially-desirable SoC is neither academically
nor standard-investor sexy! No AI. Boring. zzzz
\item Luckily there is an anonymous sponsor who needs an SoC that
doesn't exist (who knows the commercial benefits of Libre)
\item Shakti Group know the benefits (cost, sovereignty) of a Libre
Mobile-Class SoC as well (No spying on Indian citizens!)
\item A Libre GPU, even a modest performer (100MT/s etc.)
is the biggest technical risk/unknown (besides DDR3/4).\\
(fall-back is GC800. Do please help with a Libre GPU!)
\item DDR3/4 and eMMC are the main high-risk interfaces\\
(there are fall-back strategies in place)
\item Ultimately the strategy is all about cost reduction
vs risk mitigation,
with Libre/Ethical prioritised over ``convenience''
\end{itemize}
}


\frame{
\begin{center}
{\Huge The end\vspace{20pt}\\
Thank you\vspace{20pt}\\
Questions?\vspace{20pt}
}
\end{center}

\begin{itemize}
\item Contact: lkcl@lkcl.net
\item http://libre-riscv.org/shakti/m\_class/
\end{itemize}
}


\end{document}