\documentclass[slidestop]{beamer}
\usepackage{beamerthemesplit}
\usepackage{graphics}
\usepackage{pstricks}

\title{Commercial Libre-RISCV SoC}
\author{Luke Kenneth Casson Leighton}


\begin{document}

\frame{
 \begin{center}
 \huge{Designing a Commercial Libre RISC-V SoC}\\
 \vspace{32pt}
 \Large{Ethical Strategic Leveraging of the benefits}\\
 \Large{of Libre and Open SW/HW}\\
 \Large{for pure unadulterated Commercial gain}\\
 \vspace{24pt}
 \Large{Chennai 9th RISC-V Workshop}\\
 \vspace{16pt}
 \large{\today}
 \end{center}
}


\frame{\frametitle{Credits and Acknowledgements}

 \begin{itemize}
 \item The Designers of RISC-V
 \item The RISC-V Foundation
 \item The Shakti Group, and IIT Madras RISE Group
 \item Prof. G S Madhusudan
 \item Neel Gala
 \item Rishabh Jain
 \item Members of the RISC-V Open Groups (SW/HW/ISA)
 \item Libre and Open Software and Hardware Communities
 \item Richard Herveille (RoaLogic), Edmund Humenberger, Clifford Wolf
 (Symbiotic EDA), Rudi (Asics.ws),
 Alex Forencich, LowRISC Team
 \item Anonymous Sponsor
 \end{itemize}
}


\frame{\frametitle{Why, How, What?}

 \begin{itemize}
 \item Why? Because these days it's just not necessary to
 make [un]ethical compromises in order to make a profitable,
 desirable mass-volume product\\
 {\it (There are enough companies doing that: where has it got us?)}
 \item How? By leveraging the long-established strategic cost and
 maintenance benefits of libre-licensed software (and
 HDL) and
 {\it making sure that the people who provide it are
 financially rewarded}. Also by empowering diverse team
 collaboration
 \item What? A 2.5GHz RISC-V 64-bit SoC that has
 a 3D Embedded GPU, 1080p Video decode, and interfaces
 to make it attractive for use in tablets, netbooks, industrial
 embedded and more. 22nm or less, under 400 pins, under USD \$4.\\
 {\it All sounds obvious... but is it practical and achievable?}
 \end{itemize}
}


\frame{\frametitle{Definitions}

 \begin{itemize}
 \item {\bf Business}: the provision of a service and being
 commensurately financially rewarded for doing so
 \item {\bf Sponging}: the provision of a service and being
 taken advantage of for doing so {\it (cf: Professor Yunus)}
 \item {\bf An ethical act}: an act that increases truth,
 love, awareness or creativity for one or more people
 (including yourself), {\it without} reducing those
 same four qualities {\it for anyone}
 \item {\bf The Four Freedoms}: the rights and guarantees
 associated with and embedded within GNU Licenses {\it (cf: FSF)}
 \end{itemize}
 {\it Is it possible to ethically do business and respect the
 Four Freedoms? That's where it gets interesting, as there are
 even cases where the Four Freedoms are unethical. Note: Google's
 former motto ``don't be evil'' is clearly (unintentionally) unethical}
}


\frame{\frametitle{Does what we want already exist? Surely this is nonsense!}
 \begin{center}
 \includegraphics[height=2.4in]{nolibresocs.jpg}\\
 {\bf Analysis of SoCs over the past 7+ years (answer: no)}
 \end{center}
}


\frame{\frametitle{Breakdown of non-existence of fully-Libre SoCs}

 \begin{itemize}
 \item {\bf iMX6}: Libre bootable, Vivante 3D GPU (libre etnaviv)
 but proprietary VPU (and a power-hungry Cortex A9)
 \item {\bf Allwinner SoCs}: mostly Libre bootable,
 VPU reverse engineered; GPU: MALI or PowerVR (i.e. proprietary)
 \item {\bf Rockchip SoCs}: good, but using MALI or PowerVR.
 \item {\bf TI OMAP}: good, but using PowerVR, and expensive.
 \item {\bf Samsung}: good, but using MALI.
 \item {\bf Ingenic jz4775}: GREAT! But performance
 sucks (1GHz MIPS32).
 \item {\bf Broadcom SoCs}: Cartelled, and boots from the GPU
 \end{itemize}
 {\it Basically there does not exist one single commercial SoC that
 provides full source code for all functions (CPU, GPU, VPU)
 with modern performance. Which is kinda bizarre if you think about it}
}


\frame{\frametitle{What would a good (Libre) boring, mundane SoC have?}

 \begin{itemize}
 \item Cover a lot of different scenarios (embedded, tablets, industrial,
 netbooks, crypto-currency mining).
 \item Decent performance with high efficiency. RISC-V: 40\%
 more efficient than ARM / Intel. Shakti a good
 candidate: 2.5GHz and 120mW per core @ 22nm.
 \item 1080p video: y'all gotta watch cute kittens on youtube, right?
 \item 3D GPU: y'all gotta play Angri Burds, right? (or Minecraft)
 \item No spying back-door co-processors (to steal crypto-wallets)
 \item No Spectres, no Meltdowns.
 \end{itemize}
 {\it Basically quite boring and mundane. No Monster Performance,
 no AI stuff, no special sauce. Just a plain-old SoC,
 40\% more power efficient than ARM/Intel,
 and not spying on end-users, that's all}
}


\frame{\frametitle{How on earth does an ethical Libre SoC make money???}

 \begin{itemize}
 \item Simple answer: Mask Rights.
 \item Without Mask Rights: by having a desirable
 product, and packaging it for a customer (i.e. by being a middle-man,
 a service is still being provided for which payment etc. etc.)
 \item Without a desirable product or customer(s): err... you don't.\\
 (cf: definition of Business)
 \item By not having high NREs (leveraging back-to-back deals,
 and helping others fulfil their needs and goals)
 \end{itemize}
 {\it Detachment from the goal also helps. If someone else makes this
 product then GREAT! I can go do something else}\\
 \vspace{4pt}
 {\bf Main point: please do not automatically assume Ethical and Libre is
 non-commercial. It's not nice, and it's not helping }

}

\frame{\frametitle{Things wot are ``off-limits''}

 \begin{itemize}
 \item Customer entrapment (through proprietary software).\\
 Strong business case for not entrapping customers:\\
 https://tinyurl.com/most-productive-meeting-ever
 \item Funding, endorsing, supporting or empowering unethical
 Companies, Organisations, Cartels and Individuals.\\
 (cf: definition of an ethical act).
 \item Being totally inflexible / unrealistic. Goals have
 to be met: it's no good being an idiot about that. e.g. if
 a Libre 3D GPU really can't be made, use Vivante GC800
 (with etnaviv).
 \end{itemize}
 {\it Still no real show-stoppers to making money (or product):
 it's just slightly harder, that's all. Ultimately it's about
 confidence. }
}


\frame{\frametitle{Interfaces, Block Diagram, of the Libre-RISCV SoC}
 \begin{center}
 \includegraphics[height=2.1in]{../shakti_libre_riscv.jpg}\\
 {\bf Separate Power Domains for GPIO banks, Variable voltages
 required, low-power sleep states etc. Quite involved}
 \end{center}
}


\frame{\frametitle{Hardware / Development Complexity Comparison}

 \begin{itemize}
 \item {\bf Server}: relatively easy. PCIe, RapidIO, XAUI, SATA, GbE, 10GE,
 DDR3/4 (or HMC) etc. etc. No multiplexing: all interfaces dedicated
 and high-speed differential pairs.
 \item {\bf Desktop}: really just a variant of Server.
 Graphics is a PCIe Card (except if integrated). Peripherals
 often done in dedicated external ICs (``Southbridge'' concept)
 \item {\bf Embedded}: also pretty easy. Really needs a pinmux. Low clock
 rate, low power mode. e.g. SiFive Freedom E310.
 \item {\bf Mobile}: HARD. Performance/Watt matters $\Rightarrow$ variable core
 voltage domains {\it per core}. Number of pins matters (affects
 yield and package cost). Cost
 matters. Pinmux critical.
 \end{itemize}
 {\it Bottom line: Mobile-class processors are challenging!}
}


\frame{\frametitle{Proprietary vs Libre-licensed Interface HDL}

 \begin{itemize}
 \item DDR3/4: challenging! \$1m for single-use, single instance.\\
 Symbiotic EDA: \$600k for PHY; CERN developed a Controller\\
 http://libre-riscv.org/shakti/m\_class/DDR/
 \item HyperRAM (JEDEC xSPI): lower risk than DDR3/4\\
 http://libre-riscv.org/shakti/m\_class/HyperRAM/
 \item RGMII: several available (saves \$50k)\\
 http://libre-riscv.org/shakti/m\_class/RGMII/
 \item UART, SPI, I2C, PWM, SD/MMC: all libre (except eMMC).
 \item Shakti Group has FlexBus, QuadSPI, SRAM, many more.
 \item RGB/TTL: R. Herveille (SSD2828, SN75LVDS83b, TFP410a)
 \end{itemize}
 {\it Basically there's no compelling reason to spend vast sums
 on proprietary HDL. Sorry Cadence / Mentor / Synopsys / whoever}
}


\frame{\frametitle{Challenging Stuff [1] - Memory Interfaces}

 \begin{itemize}
 \item DDR3/4 PHYs are analog and very high speed.
 Impedance training. Extreme timing tolerances on parallel buses.\\
 No surprise proprietary cost is USD \$1m and above.
 \item Symbiotic EDA will do (Libre) PHY layout for USD \$300k,
 time to completion for chosen geometry: 8-12 months.
 \end{itemize}
 {\it Silicon-proven but still risky. What are the alternatives?}
 \vspace{4pt}
 \begin{itemize}
 \item 133MHz 32-bit SDRAM (um...) maybe even FlexBus?
 \item HyperRAM (aka JEDEC xSPI) 8-bit SPI 166MHz or DDR-300.\\
 300MB/s for only 13 wires, not bad! (We'll take several)\\
 http://libre-riscv.org/shakti/m\_class/HyperRAM/
 \item HMC: insanely fast, very low power. OpenHMC (LGPL)\\
 https://opencores.org/project/openhmc
 \end{itemize}
}
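

\frame{\frametitle{Worked example: HyperRAM bandwidth per wire}

 {\it A rough sketch of where the 300MB/s figure comes from. It assumes
 ``DDR-300'' means 300 million 8-bit transfers per second: check the
 datasheet of whichever part is chosen.}
 \[
   B = \frac{8~\mathrm{bit}}{8~\mathrm{bit/byte}} \times 300~\mathrm{MT/s}
     = 300~\mathrm{MB/s}
 \]
 \[
   \frac{B}{13~\mathrm{wires}} \approx 23~\mathrm{MB/s~per~wire}
 \]
 {\it So $N$ independent HyperRAM channels would give roughly
 $300N$ MB/s peak for $13N$ pins, before any command or refresh overhead.}
}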


\frame{\frametitle{Challenging Stuff [2] - Video Decode Engine}

 \begin{itemize}
 \item Richard Herveille's Video Core Blocks\\
 https://opencores.org/project/video\_systems
 \item Symbiotic EDA MP4 decoder in FPGA
 \item H.264 seems to have been done...\\
 https://github.com/adsc-hls/synthesizable\_h264
 \item Really needs SIMD (or better, not-SIMD)\\
 {http://libre-riscv.org/simple\_v\_extension/}
 \item Definitely needs xBitManip (parallelised by Simple-V)\\
 https://github.com/cliffordwolf/xbitmanip
 \end{itemize}
 {\it SIMD is insane. $O(N^6)$ opcode proliferation. See\\
 https://www.sigarch.org/simd-instructions-considered-harmful/ \\
 (1): P-Ext designed for Audio. (2): Investigate RI5CY's SIMD
 }
}
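

\frame{\frametitle{Worked example: 1080p decode output rate}

 {\it A back-of-envelope sketch of what ``1080p Video decode'' asks of the
 memory system. The 30 frames/sec rate and the 1.5 bytes/pixel (8-bit
 4:2:0) output format are assumptions, not project requirements.}
 \[
   1920 \times 1080 = 2{,}073{,}600~\mathrm{pixels/frame}
 \]
 \[
   2{,}073{,}600 \times 30~\mathrm{frames/s} \approx 62.2~\mathrm{MP/s}
 \]
 \[
   62.2~\mathrm{MP/s} \times 1.5~\mathrm{bytes/pixel} \approx 93.3~\mathrm{MB/s}
 \]
 {\it That is output writes only: reference-frame reads for motion
 compensation come on top, and 60 frames/sec doubles everything.}
}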


\frame{\frametitle{Challenging Stuff [3] - Power Management}

 \begin{itemize}
 \item Been done before (many times), but not as a Libre Design.
 \item Sanjay Charagulla: GlobalFoundries 22nm mobile process
 can reach as low as 0.4V
 \item GPIO Banks need per-bank VREF (1.8V? to 3.3V)\\
 IO pads need built-in
 level-shifting to convert to CPU VCORE
 \item Each core needs independent variable-voltage capability
 and independent shut-down (PMIC supplies external voltage)
 \item DDR RAM still needs refreshing (even in sleep mode)
 \item Extra RV32 (PicoRV32?) always-on core for wake-up / RTC?
 \item PLLs are Analog. fun fun fun in the sun sun sun...
 \end{itemize}
 {\it Really need help. PLLs, Analog stuff: specific
 domain expertise. Fall-back example:
 https://www.dolphin-integration.com?
 }
}


\frame{\frametitle{Challenging Stuff [4] - Libre 3D GPU. Sigh.}

 \begin{itemize}
 \item Actual requirements quite modest: 30MP/s 100MT/s 5GFLOPS
 but power/area is crucial (2mm$^2$ @ 40nm)
 \item Nyuzi, MIAOW, GPLGPU (Number Nine), OGP.
 \item Nyuzi based on Larrabee. Jeff Bush really helpful.
 \item MIAOW is an OpenCL engine. GPLGPU is fixed-function
 \item Nyuzi lessons: Software-only rendering not enough.
 Getting through L1 cache takes most power. Fixed functions
 such as parallel FP-Quad to ARGB Pixel, and Z-Buffer
 needed.
 \item Fallback is GC800 (\$250k) {\it contact me if you can do better!}
 \end{itemize}
 {\it Jacob Bachmeyer's Cache-control proposal turns L1 Cache into
 scratchpad RAM. RVV is just too heavy (sorry!), Simple-V much
 more light-weight and flexible ($O(1)$ ISA proliferation)
 }
}


\frame{\frametitle{Challenging Stuff [5] - Custom Extensions}

 \begin{itemize}
 \item GPUs are usually done with incompatible ISAs, effectively
 doing OpenGL over IPC / RPC (Remote Procedure Calls)
 \item Much simpler: GPGPU ``one ISA'' approach. Custom-extend the
 core ISA to handle 3D, use Gallium3D-LLVM.
 \item Now add Video Extensions, SIMD etc. and
 {\bf we are well beyond the only 2 available 32-bit custom opcodes}
 \item Due to the Libre nature of this project, the custom opcode
 space will be ``dominated'' by
 high-profile public hard-forks of gcc, binutils, llvm etc.
 Which isn't going to go down well.
 \item ISA ``Conflict Resolution'' is therefore absolutely critical\\
 http://libre-riscv.org/isa\_conflict\_resolution/
 \end{itemize}
 {\it Remember Altivec. Learn from Intel.
 \underline{This is everyone's problem.}
 }
}


\frame{\frametitle{Interesting Missing Stuff [1] - Pinmux}

 \begin{itemize}
 \item Pinmux: multiplexer of functions onto pins\\
 {\it DRAM Cell $\neq$ DDR3/4, Mux Cell $\neq$ Muxer}
 \item Strategically extremely important to Commercial SoC success\\
 STMicro, Rockchip, Freescale, Samsung, {\bf EVERYONE}
 \item Bizarrely, a libre-licensed multi-way Pinmux doesn't exist.\\
 {\it not on anyone's radar. at all.}
 SiFive IOF not enough.
 \item Verification (scenario analysis) and auto-generation of
 TRM, header files, device-tree files, pretty much everything
 makes sense (to any ``lazy'' Software Engineer...)
 \item Corporations with their own pinmux unlikely to be interested.
 \item http://git.libre-riscv.org/?p=pinmux.git \\
 http://hands.com/\textasciitilde{}lkcl/pinmux\_chennai\_2018.pdf
 \end{itemize}
}
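

\frame{\frametitle{Illustration: what a multi-way pinmux map looks like}

 {\it A purely hypothetical three-pin, three-way map, to show the idea.
 The pin names and function assignments are invented for this slide and
 are not taken from the pinmux project.}
 \begin{center}
 \begin{tabular}{l|lll}
 Pin & Mux 0 & Mux 1 & Mux 2 \\
 \hline
 P0 & GPIO0 & UART0\_TX & PWM0 \\
 P1 & GPIO1 & UART0\_RX & I2C0\_SDA \\
 P2 & GPIO2 & SPI0\_CLK & I2C0\_SCL \\
 \end{tabular}
 \end{center}
 {\it In this map UART0 and I2C0 cannot be used together: both need P1.
 Scenario analysis is about catching that kind of clash before tape-out,
 and one table like this is enough to generate the TRM page, the header
 file and the device-tree fragment.}
}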


\frame{\frametitle{Interesting Missing Stuff [2] - AC97/I2S, USB2 PHY}


 \begin{itemize}
 \item Rudi (Asics.ws) donating time to create a Multi-Protocol
 Audio Controller: AC97, PCM, PDM, I2S\\
 http://libre-riscv.org/shakti/m\_class/AC97/
 \item USB2 is... convoluted. UTMI-ULPI-USB2 PHY\\
 USB2-PHY not confirmed (Rudi has one)\\
 Also Rudi has DDR (8-pin) variant of ULPI\\
 http://libre-riscv.org/shakti/m\_class/ULPI/
 \item USB3 not necessarily a good idea to put into Libre-RISCV\\
 Daisho USB3 Pipe exists, TUSB1310a PHY is 175 pin FBGA!
 \item Libre SD/MMC typically at ``Open'' Level 20MB/sec approx.
 Full spec and eMMC requires membership.
 \end{itemize}
}


\frame{\frametitle{TODO}

 \begin{itemize}
 \item TODO\vspace{8pt}
 \end{itemize}
}


\frame{\frametitle{Summary}

 \begin{itemize}
 \item Making a commercially-desirable SoC is neither academically
 nor standard-investor sexy! No AI. Boring. zzzz
 \item Luckily there is an anonymous sponsor who needs an SoC that
 doesn't exist (who knows the commercial benefits of Libre)
 \item Shakti Group know the benefits (cost, sovereignty) of a Libre
 Mobile-Class SoC as well (No spying on Indian citizens!)
 \item A Libre GPU, even a modest performer (100MT/s etc.)
 is the biggest technical risk/unknown, besides DDR3/4.\\
 (fall-back is GC800. Do please help with a Libre GPU!)
 \item DDR3/4 and eMMC are the main high-risk interfaces\\
 (there are fall-back strategies in place)
 \item Ultimately the strategy is all about cost reduction
 vs risk mitigation,
 with Libre/Ethical prioritised over ``convenience''
 \end{itemize}
}


\frame{
 \begin{center}
 {\Huge The end\vspace{20pt}\\
 Thank you\vspace{20pt}\\
 Questions?\vspace{20pt}
 }
 \end{center}

 \begin{itemize}
 \item Contact: lkcl@lkcl.net
 \item http://libre-riscv.org/shakti/m\_class/
 \end{itemize}
}


\end{document}