X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=shakti%2Fm_class%2Flibre_3d_gpu.mdwn;h=5d1f5d274d4b759736d34c1a5aaa71c731cc99c8;hb=4894bd75d121bca8e580b3ee81dc7560cf3f9315;hp=9c487480a20b847e8219f8e85fe1e11415fb50ee;hpb=f646adc66bbad37aba81cc7e7b1b1e6e35563621;p=libreriscv.git
diff --git a/shakti/m_class/libre_3d_gpu.mdwn b/shakti/m_class/libre_3d_gpu.mdwn
index 9c487480a..5d1f5d274 100644
--- a/shakti/m_class/libre_3d_gpu.mdwn
+++ b/shakti/m_class/libre_3d_gpu.mdwn
@@ -83,6 +83,33 @@ And the assessment, design and implementation is being done here:
+----
+
+My feeling is therefore that the following approach involves minimal work:
+
+* Investigate the ChiselGPU code to see if it can be leveraged (an "image" added instead of straight ARGB color).
+* OR... add sufficient fixed-function 3D instructions (plus a memory scratch area) to RISC-V to do the equivalent job (see the sketch after this list).
+* Implement the Simple-V RISC-V "parallelism" extension (which can parallelize xBitManip *and* the above-suggested 3D fixed-function instructions).
+* Wait for RISC-V LLVM to have vectorization support added to it.
+* MODIFY the resultant RISC-V LLVM code so that it supports Simple-V.
+* Grab the gallium3d-llvm source code and hit the "compile" button.
+* Grab the *standard* Mesa3D library, tell it to use the gallium3d-llvm library and hit the "compile" button.
+* See what happens.
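+
+As an illustration of what the fixed-function instructions (or a ChiselGPU-derived image fetch) would be replacing, here is a minimal scalar C sketch of the per-pixel work in question. It is not taken from any of the above projects: the names, the ARGB8888 layout and the nearest-neighbour sampling are illustrative assumptions only.
+
+    #include <stdint.h>
+
+    /* One pixel: nearest-neighbour texture fetch plus alpha blend.
+     * This whole body is roughly what a fixed-function 3D
+     * instruction would collapse into a single opcode. */
+    static inline uint32_t texel_blend(const uint32_t *tex,
+                                       int tw, int th,
+                                       float u, float v,
+                                       uint32_t dst)
+    {
+        int x = (int)(u * (tw - 1));
+        int y = (int)(v * (th - 1));
+        uint32_t src = tex[y * tw + x];      /* ARGB8888 fetch */
+        uint32_t a   = src >> 24;            /* alpha, 0..255  */
+        uint32_t rb  = ((src & 0x00ff00ffu) * a +
+                        (dst & 0x00ff00ffu) * (255u - a)) >> 8;
+        uint32_t g   = ((src & 0x0000ff00u) * a +
+                        (dst & 0x0000ff00u) * (255u - a)) >> 8;
+        return (rb & 0x00ff00ffu) | (g & 0x0000ff00u) | 0xff000000u;
+    }
+
+    /* The scalar loop Simple-V would parallelize: with the vector
+     * extension, a setvl-style loop covers a whole span at once,
+     * so the body issues per *vector* rather than per pixel. */
+    void draw_span(uint32_t *fb, const uint32_t *tex, int tw, int th,
+                   int n, float u, float v, float du, float dv)
+    {
+        for (int i = 0; i < n; i++, u += du, v += dv)
+            fb[i] = texel_blend(tex, tw, th, u, v, fb[i]);
+    }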
+
+Now, interestingly, if spike (the RISC-V ISA simulator) is thrown into the mix, it should be perfectly possible to get an idea of where the performance of the above would need optimization, just as Jeff did with the Nyuzi paper.
+
+He focussed on specific algorithms, checked the assembly code, and worked out how many instruction cycles per pixel were needed, which is an invaluable measure.
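+
+A minimal sketch of how those per-pixel counts could be gathered on RISC-V: the standard `rdinstret` counter instruction works under spike, so bracketing a kernel with it (here the hypothetical draw_span from the sketch above) yields instructions-per-pixel directly. RV64 is assumed; RV32 would additionally need the high half of the counter.
+
+    #include <stdio.h>
+    #include <stdint.h>
+
+    /* draw_span as defined in the earlier sketch. */
+    void draw_span(uint32_t *fb, const uint32_t *tex, int tw, int th,
+                   int n, float u, float v, float du, float dv);
+
+    /* Read the retired-instruction counter (one CSR read on RV64). */
+    static inline uint64_t instret(void)
+    {
+        uint64_t x;
+        __asm__ volatile ("rdinstret %0" : "=r" (x));
+        return x;
+    }
+
+    /* Nyuzi-paper-style accounting: instructions per pixel. */
+    void measure_span(uint32_t *fb, const uint32_t *tex,
+                      int tw, int th, int n)
+    {
+        uint64_t i0 = instret();
+        draw_span(fb, tex, tw, th, n, 0.0f, 0.0f, 1.0f / n, 0.0f);
+        uint64_t i1 = instret();
+        printf("%.1f instructions/pixel\n", (double)(i1 - i0) / n);
+    }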
+
+As I mention in the above page, one of the problems with doing a completely separate engine (Nyuzi is actually a general-purpose RISC-based vector processor) is that when it comes to using it, you need to transfer all the "state" data structures from the main core over to the GPU's core.
+
+... But if the main core is RISC-V *and the GPU is RISC-V as well*, and they are SMP cores, then transferring the state is simply a matter of doing a context-switch... or, if *all* cores have the vector and 3D instruction extensions, a context-switch is not needed at all.
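+
+In C terms the argument looks like this (a sketch only: the state structure and the function names are hypothetical placeholders, not any existing driver API):
+
+    #include <string.h>
+    #include <stdint.h>
+
+    /* Hypothetical bundle of rasteriser state: framebuffer, texture,
+     * blend mode and so on. Fields are placeholders. */
+    struct gpu_state {
+        uint32_t       *framebuffer;
+        const uint32_t *texture;
+        int             width, height;
+        uint32_t        blend_mode;
+    };
+
+    /* Separate engine (the Nyuzi situation): the state must be
+     * serialised and copied into the GPU's own memory first. */
+    void submit_discrete(void *gpu_mem, const struct gpu_state *s)
+    {
+        memcpy(gpu_mem, s, sizeof *s);  /* bus/DMA traffic, plus    */
+                                        /* doorbell, cache flushes, */
+                                        /* completion polling...    */
+    }
+
+    /* SMP RISC-V cores sharing one address space and one ISA:
+     * "transferring the state" is passing a pointer, and if every
+     * core has the vector/3D extensions, not even a context-switch
+     * is needed - just call the kernel. */
+    void submit_smp(void (*kernel)(struct gpu_state *),
+                    struct gpu_state *s)
+    {
+        kernel(s);
+    }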
+
+Will that approach work? Honestly I have absolutely no idea, but it would be a fascinating and extremely ambitious research project.
+
+Can we get people to fund it? Yeah, I believe so. There's a lot of buzz about RISC-V, and a lot of buzz can be created about a libre 3D GPU. If that same GPU happens to be good at crypto-currency mining there will be a LOT more attention paid, particularly given that the NSA is *known* to have blackmailed Intel into putting a spying back-door co-processor into x86, and people have noticed that it miiight not be a good idea to rely on proprietary GPUs and CPUs to manage billions of dollars' worth of crypto-currency.
+
## Q & A
> Q: