X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=shakti%2Fm_class%2Flibre_riscv_chennai_2018.tex;h=9cb1e37d888e5c7bb30ecd3a86720564a34ccc41;hb=9b0fb702cdad21eeed19892d1cc71fe02454e9cb;hp=b21f3f1b907ad589edd6a1fb4a76148bb80ae691;hpb=86917b64118181dff74134264872be3b2b26dde7;p=libreriscv.git diff --git a/shakti/m_class/libre_riscv_chennai_2018.tex b/shakti/m_class/libre_riscv_chennai_2018.tex index b21f3f1b9..9cb1e37d8 100644 --- a/shakti/m_class/libre_riscv_chennai_2018.tex +++ b/shakti/m_class/libre_riscv_chennai_2018.tex @@ -27,11 +27,18 @@ \frame{\frametitle{Credits and Acknowledgements} \begin{itemize} - \item The Designers of RISC-V\vspace{15pt} - \item The Shakti Group\vspace{15pt} - \item Prof. G S Madhusudan\vspace{15pt} - \item Neel Gala\vspace{15pt} - \item Rishabh Jain\vspace{15pt} + \item The Designers of RISC-V + \item The RISC-V Foundation + \item The Shakti Group, and IIT Madras RISE Group + \item Prof. G S Madhusudan + \item Neel Gala + \item Rishabh Jain + \item Members of the RISC-V Open Groups (SW/HW/ISA) + \item Libre and Open Software and Hardware Communities + \item Richard Herveille (RoaLogic), Edmund Humenberger, Clifford Wolf + (Symbiotic EDA), Rudi (Asics.ws), Enjoy-Digital.fr, + Alex Forencich, LowRISC Team + \item Anonymous Sponsor \end{itemize} } @@ -79,7 +86,7 @@ } -\frame{\frametitle{Does what we want already exist?} +\frame{\frametitle{Does what we want already exist? Surely this is nonsense!} \begin{center} \includegraphics[height=2.4in]{nolibresocs.jpg}\\ {\bf Analysis of SoCs over the past 7+ years (answer: no)} @@ -87,7 +94,7 @@ } -\frame{\frametitle{What's the problem?} +\frame{\frametitle{Breakdown of non-existence of fully-Libre SoCs} \begin{itemize} \item {\bf iMX6}: Libre bootable, Vivante 3D GPU (libre etnaviv) @@ -107,12 +114,12 @@ } -\frame{\frametitle{So what's needed?
What would a good (Libre) SoC have?} +\frame{\frametitle{What would a good (Libre) boring, mundane SoC have?} \begin{itemize} \item Cover a lot of different scenarios (embedded, tablets, industrial, netbooks, crypto-currency mining). - \item Decent performance with high efficiency. RISC-V: 40 \% + \item Decent performance with high efficiency. RISC-V: 40\% more efficient than ARM / Intel. Shakti a good candidate: 2.5ghz and 120mW per core @ 22nm. \item 1080p video: y'all gotta watch cute kittens on youtube, right? @@ -137,10 +144,13 @@ \item Without a desirable product or customer(s): err... you don't.\\ (cf: definition of Business) \item By not having high NREs (leveraging back-to-back deals, - and helping others fulfil their needs) + and helping others fulfil their needs and goals) \end{itemize} {\it Detachment from the goal also helps. If someone else makes this - product then GREAT! I can go do something else} + product then GREAT! I can go do something else}\\ + \vspace{4pt} + {\bf Main point: please do not automatically assume Ethical and Libre is + non-commercial. It's not nice, and it's not helping } } @@ -149,14 +159,17 @@ \begin{itemize} \item Customer entrapment (through proprietary software).\\ Strong business case for not entrapping customers:\\ - https://tinyurl.com/most-productive-meeting-ever - \item Funding, endorsing, supporting or otherwise empowering - unethical Companies, Organisations and Individuals.\\ + \url{https://tinyurl.com/most-productive-meeting-ever} + \item Funding, endorsing, supporting or empowering unethical + Companies, Organisations, Cartels and Individuals.\\ (cf: definition of an ethical act). \item Being totally inflexible / unrealistic. Goals have to be met: it's no good being an idiot about that. e.g. if a Libre 3D GPU really can't be made, use Vivante GC800 (with etnaviv). + \item Spying back-door co-processors a no-no. Sovereignty + is critical. Russia has Baikal. China has Loongson. 
+ \end{itemize} {\it Still no real show-stoppers to making money (or product): it's just slightly harder, that's all. Ultimately it's about @@ -193,6 +206,175 @@ } +\frame{\frametitle{Proprietary vs Libre-licensed Interface HDL} + + \begin{itemize} + \item DDR3/4: challenging! \$1m for single-use, single instance.\\ + Symbiotic EDA: \$600k for PHY; CERN developed a Controller\\ + \url{http://libre-riscv.org/shakti/m_class/DDR/} + \item HyperRAM (JEDEC xSPI): lower risk than DDR3/4\\ + \url{http://libre-riscv.org/shakti/m_class/HyperRAM/} + \item RGMII: several available (saves \$50k)\\ + \url{http://libre-riscv.org/shakti/m_class/RGMII/} + \item UART, SPI, I2C, PWM, SD/MMC: all libre (except eMMC). + \item Shakti Group has FlexBus, QuadSPI, SRAM, many more. + \item RGB/TTL: R. Herveille (SSD2828, SN75LVDS83b, TFP410a) + \end{itemize} + {\it Basically there's no compelling reason to spend vast sums + on proprietary HDL. Sorry Cadence / Mentor / Synopsys / whoever} +} + + +\frame{\frametitle{Challenging Stuff [1] - Memory Interfaces} + + \begin{itemize} + \item DDR3/4 PHYs are analog and very high speed. + Impedance training. Extreme timing tolerances on parallel buses.\\ + No surprise proprietary cost is USD \$1m and above. + \item Symbiotic EDA will do (Libre) PHY layout for USD \$300k, + time to completion for chosen geometry: 8-12 months. + \end{itemize} + {\it Silicon-proven but still risky. What are the alternatives?} + \vspace{4pt} + \begin{itemize} + \item FlexBus/SDRAM (low clock, lots of pins, single-data-rate). + \item HyperRAM (aka JEDEC xSPI) 8-bit SPI 166MHz or DDR-300.\\ + 300MByte/sec for only 13 wires, not bad! (We'll take several)\\ + \url{http://libre-riscv.org/shakti/m_class/HyperRAM/} + \item HMC: insanely fast, very low power.
OpenHMC (LGPL) + \url{https://opencores.org/project/openhmc} + \end{itemize} +} + + +\frame{\frametitle{Challenging Stuff [2] - Video Decode Engine} + + \begin{itemize} + \item Richard Herveille's Video Core Blocks\\ + https://opencores.org/project/video\_systems + \item Symbiotic EDA MP4 decoder in FPGA + \item H.264 seems to have been done...\\ + https://github.com/adsc-hls/synthesizable\_h264 + \item Really needs SIMD (or better, not-SIMD)\\ + \url{http://libre-riscv.org/simple_v_extension/} + \item Definitely needs xBitManip (parallelised by Simple-V)\\ + \url{https://github.com/cliffordwolf/xbitmanip} + \end{itemize} + {\it SIMD is insane. $O(N^6)$ opcode proliferation. See\\ + https://www.sigarch.org/simd-instructions-considered-harmful/ \\ + (1): P-Ext designed for Audio. (2): Investigate RI5CY's SIMD + } +} + + +\frame{\frametitle{Challenging Stuff [3] - Power Management} + + \begin{itemize} + \item Been done before (many times), but not as a Libre Design. + \item Sanjay Charagulla: GlobalFoundries 22nm mobile process + can reach as low as 0.4v + \item GPIO Banks need per-bank VREF (1.8v? to 3.3v)\\ + IO pads need built-in + level-shifting to convert to CPU VCORE + \item Each core needs independent variable-voltage capability + and independent shut-down (PMIC supplies external voltage) + \item DDR RAM still needs refreshing (even in sleep mode) + \item Extra RV32 (PicoRV32?) always-on core for wake-up / RTC + \item PLLs are Analog. fun fun fun in the sun sun sun... + \end{itemize} + {\it Really need help. PLLs, Analog stuff: specific + domain expertise. Fall-back example:} + \url{https://www.dolphin-integration.com}? + +} + + +\frame{\frametitle{Challenging Stuff [4] - Libre 3D GPU. Sigh.} + + \begin{itemize} + \item Actual requirements quite modest: 30MP/s 100MT/s 5GFLOPS + but power/area is crucial ($2mm^2$ @ 40nm, 1W) + \item Nyuzi, MIAOW, GPLGPU (Number Nine), OGP. + \item Nyuzi based on Larrabee. Jeff Bush really helpful. 
+ \item MIAOW is an OpenCL engine. GPLGPU is fixed-function + \item Nyuzi lessons: Software-only rendering not enough. + Getting through L1 cache takes most power. Fixed functions + such as parallel FP-Quad to ARGB Pixel, and Z-Buffer + needed. + \item Fallback is GC800 (\$250k) {\it contact me if you can do better!} + \end{itemize} + {\it Jacob Bachmeyer's Cache-control proposal turns L1 Cache into + scratchpad RAM. RVV is just too heavy (sorry!), Simple-V much + more light-weight and flexible ($O(1)$ ISA proliferation) + } +} + + +\frame{\frametitle{Challenging Stuff [5] - Public Custom Extensions} + + \begin{itemize} + \item GPUs are usually done with incompatible ISAs and effectively + doing OpenGL over IPC / RPC (Remote Procedure Calls) + \item Much simpler: GPGPU "one ISA" approach. Custom-extend the + core ISA to handle 3D, use Gallium3D-LLVM. + \item Now add Video Extensions, SIMD etc. and + {\bf we are well beyond the only 2 available 32-bit custom opcodes} + \item Due to the Libre nature of this project, the custom opcode + space will be "dominated" by + high-profile public hard-forks of gcc, binutils, llvm etc. + Which isn't going to go down well. + \item ISA "Conflict Resolution" is therefore absolutely critical\\ + \url{http://libre-riscv.org/isa_conflict_resolution/} + \end{itemize} + {\it Remember Altivec. Learn from Intel. + \underline{This is everyone's problem.} + } +} + + +\frame{\frametitle{Interesting Missing Stuff [1] - Pinmux} + + \begin{itemize} + \item Pinmux: multiplexer of functions onto pins\\ + {\it DRAM Cell != DDR3/4, Mux Cell != Muxer} + \item Strategically extremely important to Commercial SoC success\\ + STMicro, Rockchip, Freescale, Samsung, TI, {\bf EVERYONE} + \item Bizarrely, a libre-licensed multi-way Pinmux doesn't exist.\\ + {\it not on anyone's radar. at all.} + SiFive IOF not enough.
+ \item Verification (scenario analysis) and auto-generation of + TRM, header files, device-tree files, pretty much everything + makes sense (to any "lazy" Software Engineer...) + \item Corporations with legacy pinmux unlikely to be interested. + \item \url{http://git.libre-riscv.org/?p=pinmux.git} \\ + \url{http://hands.com/~lkcl/pinmux\_chennai\_2018.pdf} + \end{itemize} +} + + +\frame{\frametitle{Interesting Missing Stuff [2] - AC97/I2S, USB2 PHY} + + +\begin{itemize} + \item Rudi (Asics.ws) donating time to create a Multi-Protocol + Audio Controller: AC97, PCM, PDM, I2S\\ + \url{http://libre-riscv.org/shakti/m_class/AC97/} + \item USB2 is... convoluted. UTMI-ULPI-USB2 PHY\\ + USB2-PHY not confirmed (Rudi has one)\\ + Also Rudi has DDR (8-pin) variant of ULPI + \url{http://libre-riscv.org/shakti/m_class/ULPI/} + \item USB3 not necessarily a good idea to put into Libre-RISCV\\ + Daisho USB3 Pipe exists, TUSB1310a PHY is 175 pin FBGA! + \item Libre SD/MMC typically at "Open" Level 20MB/sec approx. + Full spec and eMMC requires membership (obtained already). + \end{itemize} + {\it Trying to keep interfaces all-digital (USB3 isn't, + HP/Mic definitely isn't). Use + external PHYs or Multi-chip Module. + } +} + + \frame{\frametitle{TODO} \begin{itemize} @@ -204,7 +386,20 @@ \frame{\frametitle{Summary} \begin{itemize} - \item TODO + \item Making a commercially-desirable SoC is neither academically + nor standard-investor sexy! No AI. Boring. zzzz + \item Luckily there is an anonymous sponsor who needs an SoC that + doesn't exist (who knows the commercial benefits of Libre) + \item Shakti Group know the benefits (cost, sovereignty) of a Libre + Mobile-Class SoC as well (No spying on Indian citizens!) + \item A Libre GPU, even a modest performer (100MT/s etc.) + is the biggest technical risk/unknown, besides DDR3/4.\\ + (fall-back is GC800. Do please help with a Libre GPU!)
+ \item DDR3/4 and eMMC are the main high-risk interfaces\\ + (there are fall-back strategies in place) + \item Ultimately the strategy is all about cost reduction + vs risk mitigation, + with Libre/Ethical prioritised over "convenience" \end{itemize} } @@ -218,8 +413,8 @@ \end{center} \begin{itemize} - \item Discussion: - \item http://libre-riscv.org/shakti/m\_class/ + \item Contact: lkcl@lkcl.net + \item \url{http://libre-riscv.org/shakti/m_class/} \end{itemize} }