# Requirements

## GPU size and power

> 1.1. GPU size MUST be < 0.XX mm for ASICs after synthesis with
> DesignCompiler tool using YY cell library at ZZ nm tech.

basically the power requirement should be at or below around 1 watt
in 40nm. beyond 1 watt it becomes... difficult. size is not
particularly critical as such but should not be insane.

so here's a table showing embedded cores:
<https://www.cnx-software.com/2013/01/19/gpus-comparison-arm-mali-vs-vivante-gcxxx-vs-powervr-sgx-vs-nvidia-geforce-ulp/>

GC800 has (in 40nm):

* 35 million triangles/sec
* 325 million pixels/sec
* 6 GFLOPS
* 1.9mm^2 synthesis area
* 2.5mm^2 silicon area.
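
as a rough baseline for comparing proposals, the GC800 figures above can be normalised per mm^2 of area. this is only a sketch: the dictionary keys and the `per_mm2` helper are made-up names for illustration, and the numbers are simply the ones quoted above from the linked table.

```python
# GC800 figures (40nm) as quoted above from the linked comparison table.
# the key names are illustrative only, not taken from any vendor datasheet.
GC800 = {
    "triangles_per_sec": 35e6,
    "pixels_per_sec": 325e6,
    "gflops": 6.0,
    "synthesis_area_mm2": 1.9,
    "silicon_area_mm2": 2.5,
}


def per_mm2(figures, area_key="silicon_area_mm2"):
    """divide each throughput figure by the chosen area (hypothetical helper)."""
    area = figures[area_key]
    return {
        name: value / area
        for name, value in figures.items()
        if not name.endswith("_mm2")  # skip the area entries themselves
    }


if __name__ == "__main__":
    for name, value in per_mm2(GC800).items():
        print(f"{name}: {value:.3g} per mm^2")
```

a candidate design's own figures can be passed through the same helper, with the caveat that area is only a rough stand-in for power.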

silicon area corresponds *ROUGHLY* with power usage, but PLEASE do
not take that as absolute, because if you read jeff's nyuzi 2016 paper
you'll see that getting data through the L1/L2 cache barrier is by far
and away the biggest eater of power.

note, lower down in that table, that the numbers for MALI400 are for
the *4* core version - MALI400-MP4. jeff and i compared against MALI400
SINGLE CORE and discovered that nyuzi, if 4 parallel nyuzi cores were put
together, would reach only 25% of MALI400's performance (in about the
same silicon area).

## Other

* Deadline = 12-18 months
* The GPU is matched by the Gallium3D driver
* RTL must be sufficient to run on an FPGA.
* Software must be licensed under LGPLv2+ or BSD/MIT.
* Hardware (RTL) must be licensed under BSD or MIT with no
"NON-COMMERCIAL" CLAUSES.
* Any proposals will be competing against Vivante GC800 (using Etnaviv driver).
* The GPU is integrated (like Mali400). So all that the GPU needs to do
is write to an area of memory (framebuffer or area of the framebuffer).
The SoC - which in this case has a RISC-V core and has peripherals such
as the LCD controller - will take care of the rest.
* In this architecture, the GPU, the CPU and the peripherals are all on
the same AXI4 shared memory bus. They all have access to the same shared
DDR3/DDR4 RAM. So as a result the GPU will use AXI4 to write directly
to the framebuffer and the rest will be handled by the SoC.
* The job must be done by a team that shows sufficient expertise to
reduce the risk. (Do you mean a team with good CVs? What about if the
team shows you an acceptable FPGA prototype? I’m talking about a team
of students which do not have big industrial CVs but they know how to
handle this job (just like RocketChip or MIAOW or etc…).)
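
The "write to an area of memory" item in the list above can be illustrated with a small software model. This is only a sketch under stated assumptions: the list does not specify a framebuffer layout, so a linear, row-major buffer with a fixed stride is assumed, a `bytearray` stands in for the shared DDR RAM, and `pixel_address` and `fill_rect` are hypothetical names that do not come from any existing driver or RTL.

```python
def pixel_address(fb_base, stride_bytes, bytes_per_pixel, x, y):
    """Byte address of pixel (x, y) in a linear, row-major framebuffer."""
    return fb_base + y * stride_bytes + x * bytes_per_pixel


def fill_rect(memory, fb_base, stride_bytes, x0, y0, width, height, pixel):
    """Write one pixel value over a rectangle of the framebuffer.

    `memory` is a bytearray standing in for shared RAM; `pixel` is a bytes
    object holding a single pixel (its length sets the bytes per pixel).
    """
    bpp = len(pixel)
    row = pixel * width  # one scanline's worth of the rectangle
    for y in range(y0, y0 + height):
        start = pixel_address(fb_base, stride_bytes, bpp, x0, y)
        memory[start:start + len(row)] = row


if __name__ == "__main__":
    WIDTH, HEIGHT, BPP = 8, 4, 4   # tiny hypothetical display, 32-bit pixels
    STRIDE = WIDTH * BPP
    FB_BASE = 16                   # hypothetical framebuffer offset in RAM
    ram = bytearray(FB_BASE + STRIDE * HEIGHT)
    fill_rect(ram, FB_BASE, STRIDE, 2, 1, 3, 2, b"\x00\x00\xff\xff")
    first = pixel_address(FB_BASE, STRIDE, BPP, 2, 1)
    print(ram[first:first + BPP].hex())
```

In the real SoC the writes would be AXI4 bus transactions rather than slice assignments, and the LCD controller would read the same region independently; the address arithmetic is the only part this model tries to capture.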

response:

> Deadline = ?

about 12-18 months which is really tight. if an FPGA (or simulation)
plus the basics of the software driver are at least prototyped by then
it *might* be ok.

if using nyuzi as the basis it *might* be possible to begin the
software port in parallel because jeff went to the trouble of writing
a cycle-accurate simulation.

> The GPU must be matched by the Gallium3D driver

that's the *recommended* approach, as i *suspect* it will result in less
work than, for example, writing an entire OpenGL stack from scratch.

> RTL must be sufficient to run on an FPGA.

a *demo* must run on an FPGA as an initial

> Software must be licensed under LGPLv2+ or BSD/MIT.

and no other licenses. GPLv2+ is out.

> Hardware (RTL) must be licensed under BSD or MIT with no “NON-COMMERCIAL
> CLAUSES”.

> Any proposals will be competing against Vivante GC800 (using Etnaviv
> driver).

in terms of price, performance and power budget, yes. if you look up
the numbers (triangles/sec, pixels/sec, power usage, die area) you'll
find it's really quite modest. nyuzi right now requires FOUR times the
silicon area of e.g. MALI400 to achieve the same performance as MALI400,
meaning that the power usage alone would be well in excess of the budget.
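
the "FOUR times" figure is just the 25% comparison from the GPU size and power section turned around. a minimal sketch of that arithmetic follows, assuming (and it is only an assumption) that performance scales linearly with silicon area; `area_for_parity` is a made-up helper name.

```python
def area_for_parity(relative_perf_at_equal_area):
    """area multiple needed to match a reference design.

    assumes performance scales linearly with silicon area, which is a
    simplification: real designs rarely scale that cleanly.
    """
    if relative_perf_at_equal_area <= 0:
        raise ValueError("relative performance must be positive")
    return 1.0 / relative_perf_at_equal_area


if __name__ == "__main__":
    # 4 nyuzi cores reach about 25% of a single MALI400 core in about
    # the same silicon area (figure quoted earlier in this document).
    nyuzi_vs_mali400 = 0.25
    print(area_for_parity(nyuzi_vs_mali400))
```

since silicon area tracks power only roughly, the same multiple is at best a first guess at the power gap.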

> The job must be done by a team that shows sufficient expertise to reduce the
> risk. (Do you mean a team with good CVs? What about if the team shows you an
> acceptable FPGA prototype?

that would be fantastic as it would demonstrate not only competence
but also commitment, and will have taken out the "risk" of being
"unknown", entirely.

> I’m talking about a team of students which do not
> have big industrial CVs but they know how to handle this job (just like
> RocketChip or MIAOW or etc…).

works perfectly for me :)