\documentclass[slidestop]{beamer}
\usepackage{beamerthemesplit}
\usepackage{graphics}
\usepackage{pstricks}

\title{Commercial Libre-RISCV SoC}
\author{Luke Kenneth Casson Leighton}


\begin{document}

\frame{
\begin{center}
\huge{Designing a Commercial Libre RISC-V SoC}\\
\vspace{32pt}
\Large{Ethical Strategic Leveraging of the benefits}\\
\Large{of Libre and Open SW/HW}\\
\Large{for pure unadulterated Commercial gain}\\
\vspace{24pt}
\Large{Chennai 9th RISC-V Workshop}\\
\vspace{16pt}
\large{\today}
\end{center}
}


\frame{\frametitle{Credits and Acknowledgements}

\begin{itemize}
\item The Designers of RISC-V\vspace{15pt}
\item The Shakti Group\vspace{15pt}
\item Prof. G S Madhusudan\vspace{15pt}
\item Neel Gala\vspace{15pt}
\item Rishabh Jain\vspace{15pt}
\end{itemize}
}


\frame{\frametitle{Why, How, What?}

\begin{itemize}
\item Why? Because these days it's just not necessary to
make [un]ethical compromises in order to make a profitable,
desirable mass-volume product\\
{\it (There are enough companies doing that: where has it got us??)}
\item How? By leveraging the long-established strategic cost and
maintenance benefits of libre-licensed software (and
HDL) and
{\it making sure that the people who provide it are
financially rewarded}. Also by empowering diverse team
collaboration
\item What? A 2.5GHz RISC-V 64-bit SoC that has
a 3D Embedded GPU, 1080p Video decode, and interfaces
to make it attractive for use in tablets, netbooks, industrial
embedded and more. 22nm or less, under 400 pins, under USD \$4.\\
{\it All sounds obvious... but is it practical and achievable?}
\end{itemize}
}


\frame{\frametitle{Definitions}

\begin{itemize}
\item {\bf Business}: the provision of a service and being
commensurately financially rewarded for doing so
\item {\bf Spongeing}: the provision of a service and being
taken advantage of for doing so {\it (cf: Professor Yunus)}
\item {\bf An ethical act}: an act that increases truth,
love, awareness or creativity for one or more people
(including yourself), {\it without} reducing those
same four qualities {\it for anyone}
\item {\bf The Four Freedoms}: the rights and guarantees
associated with and embedded within GNU Licenses {\it (cf: FSF)}
\end{itemize}
{\it Is it possible to ethically do business and respect the
Four Freedoms? That's where it gets interesting, as there are
even cases where the Four Freedoms are unethical. Note: Google's
former motto ``don't be evil'' is clearly (unintentionally) unethical}
}


\frame{\frametitle{Does what we want already exist?}
\begin{center}
\includegraphics[height=2.4in]{nolibresocs.jpg}\\
{\bf Analysis of SoCs over the past 7+ years (answer: no)}
\end{center}
}


\frame{\frametitle{What's the problem?}

\begin{itemize}
\item {\bf iMX6}: Libre bootable, Vivante 3D GPU (libre etnaviv)
but proprietary VPU (and a power-hungry Cortex A9)
\item {\bf Allwinner SoCs}: mostly Libre bootable,
VPU reverse engineered; GPU: MALI or PowerVR (i.e. proprietary)
\item {\bf Rockchip SoCs}: good but using MALI or PowerVR.
\item {\bf TI OMAP}: good but using PowerVR, and expensive.
\item {\bf Samsung}: good but using MALI.
\item {\bf Ingenic jz4775}: GREAT! But performance
sucks (1GHz MIPS32).
\item {\bf Broadcom SoCs}: Cartelled, and boots from the GPU.
\end{itemize}
{\it Basically there does not exist one single commercial SoC that
provides full source code for all functions (CPU, GPU, VPU)
with modern performance. Which is kinda bizarre if you think about it}
}


\frame{\frametitle{What would a good (Libre) boring, mundane SoC have?}

\begin{itemize}
\item Cover a lot of different scenarios (embedded, tablets, industrial,
netbooks, crypto-currency mining).
\item Decent performance with high efficiency. RISC-V: 40\%
more efficient than ARM / Intel. Shakti is a good
candidate: 2.5GHz and 120mW per core @ 22nm.
\item 1080p video: y'all gotta watch cute kittens on YouTube, right?
\item 3D GPU: y'all gotta play Angri Burds, right? (or Minecraft)
\item No spying back-door co-processors (to steal crypto-wallets)
\item No Spectres, no Meltdowns.
\end{itemize}
{\it Basically quite boring and mundane. No Monster Performance,
no AI stuff, no special sauce. Just a plain-old SoC,
40\% more power efficient than ARM/Intel,
and not spying on end-users, that's all}
}


\frame{\frametitle{How on earth does an ethical Libre SoC make money???}

\begin{itemize}
\item Simple answer: Mask Rights.
\item Without Mask Rights: by having a desirable
product, and packaging it for a customer (i.e. by being a middle-man,
a service is still being provided, for which payment etc. etc.)
\item Without a desirable product or customer(s): err... you don't.\\
(cf: definition of Business)
\item By not having high NREs (leveraging back-to-back deals,
and helping others fulfil their needs and goals)
\end{itemize}
{\it Detachment from the goal also helps. If someone else makes this
product then GREAT! I can go do something else}

}

\frame{\frametitle{Things wot are ``off-limits''}

\begin{itemize}
\item Customer entrapment (through proprietary software).\\
Strong business case for not entrapping customers:\\
https://tinyurl.com/most-productive-meeting-ever
\item Funding, endorsing, supporting or empowering unethical
Companies, Organisations, Cartels and Individuals.\\
(cf: definition of an ethical act).
\item Being totally inflexible / unrealistic. Goals have
to be met: it's no good being an idiot about that. e.g. if
a Libre 3D GPU really can't be made, use Vivante GC800
(with etnaviv).
\end{itemize}
{\it Still no real show-stoppers to making money (or product):
it's just slightly harder, that's all. Ultimately it's about
confidence.}
}


\frame{\frametitle{Interfaces, Block Diagram, of the Libre-RISCV SoC}
\begin{center}
\includegraphics[height=2.1in]{../shakti_libre_riscv.jpg}\\
{\bf Separate Power Domains for GPIO banks, Variable voltages
required, low-power sleep states etc. Quite involved}
\end{center}
}


\frame{\frametitle{Hardware / Development Complexity Comparison}

\begin{itemize}
\item {\bf Server}: relatively easy. PCIe, RapidIO, XAUI, SATA, GbE, 10GE,
DDR3/4 (or HMC) etc. etc. No multiplexing: all interfaces dedicated
and high-speed differential pairs.
\item {\bf Desktop}: really just a variant of Server.
Graphics is a PCIe Card (except if integrated). Peripherals
often done in dedicated external ICs (``Southbridge'' concept)
\item {\bf Embedded}: also pretty easy. Really needs a pinmux. Low clock
rate, low power mode. e.g. SiFive Freedom E310.
\item {\bf Mobile}: HARD. Performance/Watt matters $\Rightarrow$ variable core
voltage domains {\it per core}. Number of pins matters (affects
yield and package cost). Cost
matters. Pinmux critical.
\end{itemize}
{\it Bottom line: Mobile-class processors are challenging!}
}


\frame{\frametitle{Proprietary vs Libre-licensed Interface HDL}

\begin{itemize}
\item DDR3/4: challenging! \$1m for single-use, single instance.\\
Symbiotic EDA: \$600k for PHY; CERN developed a Controller\\
http://libre-riscv.org/shakti/m\_class/DDR/
\item HyperRAM (JEDEC xSPI): lower risk than DDR3/4\\
http://libre-riscv.org/shakti/m\_class/HyperRAM/
\item RGMII: several available (saves \$50k)\\
http://libre-riscv.org/shakti/m\_class/RGMII/
\item UART, SPI, I2C, PWM, SD/MMC: all libre (except eMMC).
\item Shakti Group has FlexBus, QuadSPI, SRAM, many more.
\item RGB/TTL: R. Herveille (SSD2828, SN75LVDS83b, TFP410a)
\end{itemize}
{\it Basically there's no compelling reason to spend vast sums
on proprietary HDL. Sorry Cadence / Mentor / Synopsys / whoever}
}


\frame{\frametitle{Challenging Stuff [1] - Memory Interfaces}

\begin{itemize}
\item DDR3/4 PHYs are analog and very high speed.
Impedance training. Extreme timing tolerances on parallel buses.\\
No surprise they cost USD \$1m and above.
\item Symbiotic EDA will do (Libre) PHY layout for USD \$300k,
time to completion for chosen geometry: 8-12 months.
\end{itemize}
{\it Silicon-proven but still risky. What are the alternatives?}
\vspace{4pt}
\begin{itemize}
\item 133MHz 32-bit SDRAM (um...) maybe even FlexBus?
\item HyperRAM (aka JEDEC xSPI) 8-bit SPI 166MHz or DDR-300.\\
300Mbyte/sec for only 13 wires, not bad! (We'll take several)\\
http://libre-riscv.org/shakti/m\_class/HyperRAM/
\item HMC: insanely fast, very low power. OpenHMC (LGPL)\\
https://opencores.org/project/openhmc
\end{itemize}
}
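

\frame{\frametitle{Worked example: memory bandwidth (rough sketch)}

{\it A back-of-envelope check of the memory figures quoted for
HyperRAM and SDRAM. Peak rates only. Single-data-rate SDRAM and the
4-channel case are assumptions for illustration, not design decisions.}
\vspace{4pt}
\begin{itemize}
\item HyperRAM at DDR-300: 8 data bits per transfer,
$300 \times 10^{6}$ transfers/sec
\[ 300 \times 10^{6} \times \frac{8}{8}\ \mathrm{bytes/sec}
   = 300\ \mathrm{Mbyte/sec} \]
\item Spread over 13 wires: $300 / 13 \approx 23$ Mbyte/sec per wire
\item 133MHz 32-bit SDRAM (assuming single data rate), for comparison
\[ 133 \times 10^{6} \times \frac{32}{8}\ \mathrm{bytes/sec}
   = 532\ \mathrm{Mbyte/sec} \]
\item Taking ``several'' as, hypothetically, 4 HyperRAM channels:
$4 \times 300 = 1200$ Mbyte/sec peak on $4 \times 13 = 52$ wires
\end{itemize}
}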


\frame{\frametitle{Challenging Stuff [2] - Video Decode Engine}

\begin{itemize}
\item Richard Herveille's Video Core Blocks\\
https://opencores.org/project/video\_systems
\item Symbiotic EDA MP4 decoder in FPGA
\item H.264 seems to have been done...\\
https://github.com/adsc-hls/synthesizable\_h264
\item Really needs SIMD (or better, not-SIMD)\\
http://libre-riscv.org/simple\_v\_extension/
\item Definitely needs xBitManip (parallelised by Simple-V)\\
https://github.com/cliffordwolf/xbitmanip
\end{itemize}
{\it SIMD is insane. $O(N^6)$ opcode proliferation. See\\
https://www.sigarch.org/simd-instructions-considered-harmful/ \\
(1): P-Ext designed for Audio. (2): Investigate RI5CY's SIMD
}
}


\frame{\frametitle{Challenging Stuff [3] - 3D GPU. Sigh.}

\begin{itemize}
\item Actual requirements quite modest: 30MP/s, 100MT/s, 5GFLOPS
but power/area is crucial ($2\,\mathrm{mm}^2$ @ 40nm)
\item Nyuzi, MIAOW, GPLGPU (Number Nine), OGP.
\item Nyuzi based on Larrabee. Jeff Bush really helpful.
\item MIAOW is an OpenCL engine. GPLGPU is fixed-function.
\item Nyuzi lessons: Software-only rendering not enough.
Getting through L1 cache takes most power. Fixed functions
such as parallel FP-Quad to ARGB Pixel, and Z-Buffer
needed.
\item Fallback is GC800 (\$250k) {\it contact me if you can do better!}
\end{itemize}
{\it Jacob Bachmeyer's Cache-control proposal turns L1 Cache into
scratchpad RAM. RVV is just too heavy (sorry!), Simple-V much
more light-weight and flexible.
}
}


\frame{\frametitle{TODO}

\begin{itemize}
\item TODO\vspace{8pt}
\end{itemize}
}


\frame{\frametitle{Summary}

\begin{itemize}
\item TODO
\end{itemize}
}


\frame{
\begin{center}
{\Huge The end\vspace{20pt}\\
Thank you\vspace{20pt}\\
Questions?\vspace{20pt}
}
\end{center}

\begin{itemize}
\item Discussion:
\item http://libre-riscv.org/shakti/m\_class/
\end{itemize}
}


\end{document}