rename page

author Luke Kenneth Casson Leighton <lkcl@lkcl.net>

Wed, 27 Jun 2018 09:59:12 +0000 (10:59 +0100)

committer Luke Kenneth Casson Leighton <lkcl@lkcl.net>

Wed, 27 Jun 2018 09:59:12 +0000 (10:59 +0100)
diff --git a/shakti/m_class/libre_3d_gpu.mdwn b/shakti/m_class/libre_3d_gpu.mdwn
new file mode 100644
index 0000000..1d22554
--- /dev/null
+++ b/shakti/m_class/libre_3d_gpu.mdwn
@@ -0,0 +1,117 @@
+# Requirements
+
+## GPU 3D capabilities
+
+Based on GC800 the following would be acceptable performance
+(as would MALI400).
+
+* 35 million triangles/sec
+* 325 million pixels/sec
+* 6 GFLOPS
+
+## GPU size and power
+
+> 1.1. GPU size MUST be < 0.XX mm for ASICs after synthesis with
+> DesignCompiler tool using YY cell library at ZZ nm tech.
+
+basically the power requirement should be at or below around 1 watt
+in 40nm.  beyond 1 watt it becomes... difficult.   size is not
+particularly critical as such but should not be insane.
+
+so here's a table showing embedded cores:
+<https://www.cnx-software.com/2013/01/19/gpus-comparison-arm-mali-vs-vivante-gcxxx-vs-powervr-sgx-vs-nvidia-geforce-ulp/>
+
+GC800 has (in 40nm):
+
+* 35 million triangles/sec
+* 325 million pixels/sec
+* 6 GFLOPS
+* 1.9mm^2 synthesis area
+* 2.5mm^2 silicon area.
+
+silicon area corresponds *ROUGHLY* with power usage, but PLEASE do
+not take that as absolute, because if you read jeff's nyuzi 2016 paper
+you'll see that getting data through the L1/L2 cache barrier is by far
+and away the biggest eater of power.
+
+note lower down that the numbers for MALI400 are for the *4* core
+version - MALI400-MP4 - whereas jeff and i compared against a SINGLE
+CORE MALI400 and discovered that 4 parallel nyuzi cores put together
+would reach only 25% of its performance (in about the same silicon
+area).
+
+## Other
+
+* Deadline = 12-18 months
+* The GPU must be matched by the Gallium3D driver
+* RTL must be sufficient to run on an FPGA.
+* Software must be licensed under LGPLv2+ or BSD/MIT.
+* Hardware (RTL) must be licensed under BSD or MIT with no
+  "NON-COMMERCIAL" CLAUSES.
+* Any proposals will be competing against Vivante GC800 (using Etnaviv driver).
+* The GPU is integrated (like Mali400). So all that the GPU needs to do
+  is write to an area of memory (framebuffer or area of the framebuffer).
+  The SoC - which in this case has a RISC-V core and has peripherals such
+  as the LCD controller - will take care of the rest.
+* In this architecture, the GPU, the CPU and the peripherals are all on
+  the same AXI4 shared memory bus. They all have access to the same shared
+  DDR3/DDR4 RAM. So as a result the GPU will use AXI4 to write directly
+  to the framebuffer and the rest will be handled by the SoC.
+* The job must be done by a team that shows sufficient expertise to
+  reduce the risk. (Do you mean a team with good CVs? What about if the
+  team shows you an acceptable FPGA prototype? I’m talking about a team
+  of students who do not have big industrial CVs but they know how to
+  handle this job, just like RocketChip or MIAOW, etc.)
+
+response:
+
+> Deadline = ?
+
+about 12-18 months which is really tight.  if an FPGA (or simulation)
+plus the basics of the software driver are at least prototyped by then
+it *might* be ok.
+
+if using nyuzi as the basis it *might* be possible to begin the
+software port in parallel because jeff went to the trouble of writing
+a cycle-accurate simulation.
+
+
+> The GPU must be matched by the Gallium3D driver
+
+that's the *recommended* approach, as i *suspect* it will result in less
+work than, for example, writing an entire OpenGL stack from scratch.
+
+
+> RTL must be sufficient to run on an FPGA.
+
+a *demo* must run on an FPGA as an initial
+
+> Software must be licensed under LGPLv2+ or BSD/MIT.
+
+and no other licenses.  GPLv2+ is out.
+
+> Hardware (RTL) must be licensed under BSD or MIT with no “NON-COMMERCIAL
+> CLAUSES”.
+> Any proposals will be competing against Vivante GC800 (using Etnaviv
+> driver).
+
+in terms of price, performance and power budget, yes.  if you look up
+the numbers (triangles/sec, pixels/sec, power usage, die area) you'll
+find it's really quite modest.  nyuzi right now requires FOUR times the
+silicon area of e.g. MALI400 to achieve the same performance as MALI400,
+meaning that the power usage alone would be well in excess of the budget.
+
+> The job must be done by a team that shows sufficient expertise to reduce the
+> risk. (Do you mean a team with good CVs? What about if the team shows you an
+> acceptable FPGA prototype?
+
+that would be fantastic as it would demonstrate not only competence
+but also commitment.  and will have taken out the "risk" of being
+"unknown", entirely.
+
+> I’m talking about a team of students which do not
+> have big industrial CVs but they know how to handle this job (just like
+> RocketChip or MIAOW or etc…).
+
+works perfectly for me :)
+
diff --git a/shakti/m_class/libre_3d_gpu.mwdn b/shakti/m_class/libre_3d_gpu.mwdn
deleted file mode 100644
index 1d22554..0000000
--- a/shakti/m_class/libre_3d_gpu.mwdn
+++ /dev/null
@@ -1,117 +0,0 @@
-# Requirements
-
-## GPU 3D capabilities
-
-Based on GC800 the following would be acceptable performance
-(as would MALI400).
-
-* 35 million triangles/sec
-* 325 million pixels/sec
-* 6 GFLOPS
-
-## GPU size and power
-
-> 1.1. GPU size MUST be < 0.XX mm for ASICs after synthesis with
-> DesignCompiler tool using YY cell library at ZZ nm tech.
-
-basically the power requirement should be at or below around 1 watt
-in 40nm.  beyond 1 watt it becomes... difficult.   size is not
-particularly critical as such but should not be insane.
-
-so here's a table showing embedded cores:
-<https://www.cnx-software.com/2013/01/19/gpus-comparison-arm-mali-vs-vivante-gcxxx-vs-powervr-sgx-vs-nvidia-geforce-ulp/>
-
-GC800 has (in 40nm):
-
-* 35 million triangles/sec
-* 325 million pixels/sec
-* 6 GFLOPS
-* 1.9mm^2 synthesis area
-* 2.5mm^2 silicon area.
-
-silicon area corresponds *ROUGHLY* with power usage, but PLEASE do
-not take that as absolute, because if you read jeff's nyuzi 2016 paper
-you'll see that getting data through the L1/L2 cache barrier is by far
-and away the biggest eater of power.
-
-note lower down that the numbers for MALI400 are for the *4* core
-version - MALI400-MP4 - whereas jeff and i compared against a SINGLE
-CORE MALI400 and discovered that 4 parallel nyuzi cores put together
-would reach only 25% of its performance (in about the same silicon
-area).
-
-## Other
-
-* Deadline = 12-18 months
-* The GPU must be matched by the Gallium3D driver
-* RTL must be sufficient to run on an FPGA.
-* Software must be licensed under LGPLv2+ or BSD/MIT.
-* Hardware (RTL) must be licensed under BSD or MIT with no
-  "NON-COMMERCIAL" CLAUSES.
-* Any proposals will be competing against Vivante GC800 (using Etnaviv driver).
-* The GPU is integrated (like Mali400). So all that the GPU needs to do
-  is write to an area of memory (framebuffer or area of the framebuffer).
-  The SoC - which in this case has a RISC-V core and has peripherals such
-  as the LCD controller - will take care of the rest.
-* In this architecture, the GPU, the CPU and the peripherals are all on
-  the same AXI4 shared memory bus. They all have access to the same shared
-  DDR3/DDR4 RAM. So as a result the GPU will use AXI4 to write directly
-  to the framebuffer and the rest will be handled by the SoC.
-* The job must be done by a team that shows sufficient expertise to
-  reduce the risk. (Do you mean a team with good CVs? What about if the
-  team shows you an acceptable FPGA prototype? I’m talking about a team
-  of students who do not have big industrial CVs but they know how to
-  handle this job, just like RocketChip or MIAOW, etc.)
-
-response:
-
-> Deadline = ?
-
-about 12-18 months which is really tight.  if an FPGA (or simulation)
-plus the basics of the software driver are at least prototyped by then
-it *might* be ok.
-
-if using nyuzi as the basis it *might* be possible to begin the
-software port in parallel because jeff went to the trouble of writing
-a cycle-accurate simulation.
-
-
-> The GPU must be matched by the Gallium3D driver
-
-that's the *recommended* approach, as i *suspect* it will result in less
-work than, for example, writing an entire OpenGL stack from scratch.
-
-
-> RTL must be sufficient to run on an FPGA.
-
-a *demo* must run on an FPGA as an initial
-
-> Software must be licensed under LGPLv2+ or BSD/MIT.
-
-and no other licenses.  GPLv2+ is out.
-
-> Hardware (RTL) must be licensed under BSD or MIT with no “NON-COMMERCIAL
-> CLAUSES”.
-> Any proposals will be competing against Vivante GC800 (using Etnaviv
-> driver).
-
-in terms of price, performance and power budget, yes.  if you look up
-the numbers (triangles/sec, pixels/sec, power usage, die area) you'll
-find it's really quite modest.  nyuzi right now requires FOUR times the
-silicon area of e.g. MALI400 to achieve the same performance as MALI400,
-meaning that the power usage alone would be well in excess of the budget.
-
-> The job must be done by a team that shows sufficient expertise to reduce the
-> risk. (Do you mean a team with good CVs? What about if the team shows you an
-> acceptable FPGA prototype?
-
-that would be fantastic as it would demonstrate not only competence
-but also commitment.  and will have taken out the "risk" of being
-"unknown", entirely.
-
-> I’m talking about a team of students which do not
-> have big industrial CVs but they know how to handle this job (just like
-> RocketChip or MIAOW or etc…).
-
-works perfectly for me :)
-