From: Luke Kenneth Casson Leighton
Date: Thu, 21 Feb 2019 13:45:37 +0000 (+0000)
Subject: add fpu dev process update
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=0e8b1862597f9d6b2218ab8e371c422b2b0252f9;p=crowdsupply.git

add fpu dev process update
---

diff --git a/images/fpu-dev-screenshot.png b/images/fpu-dev-screenshot.png
new file mode 100644
index 0000000..ae5b8c1
Binary files /dev/null and b/images/fpu-dev-screenshot.png differ
diff --git a/updates/016_2019feb17_fpu_dev_process.mdwn b/updates/016_2019feb17_fpu_dev_process.mdwn
new file mode 100644
index 0000000..c8c18dc
--- /dev/null
+++ b/updates/016_2019feb17_fpu_dev_process.mdwn
@@ -0,0 +1,96 @@
+# FPU Development Progress
+
+# Development Practices
+
+Whenever I see people working with IDEs where the editor is operated full-screen
+and yet the middle and right-hand side are entirely devoid of text, I cringe.
+I recently had to endure abuse and derision from an unethically-operated
+company for suggesting full compliance with pep8 (pep8 limits lines to
+under 80 characters). Yet at this same company, I was operating at a
+commit rate of over 500 commits per month. This exceeded the commit
+rate of the entire company of over 30 engineers.
+
+Which raises the obvious question: how on earth am I able to sustain such
+a rapid development rate, and more to the point, why isn't anyone else?
+A key difference is that I flatly refuse to use graphical IDEs.
+There's nothing that they provide which is of benefit to rapid development
+that cannot be done faster with command-line tools and an efficient desktop
+layout. More than that, the time required to move a hand off the keyboard
+and onto a mouse, then to locate the cursor, and then move the cursor,
+and then click the mouse: all of that is time wasted.
+
+{fpu-dev-screenshot.png}
+
+[This post](http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-February/000567.html)
+explains further that it is essential to get as much information on-screen
+as can possibly be managed with the computing resources available.
+*Screen real-estate is king*. We talk in computing about the Virtual Memory
+"working set", which is the set of memory pages that need to be in physical
+memory to avoid "thrashing" of swap-space; the **exact** same concept
+applies to editing and development of source code and the online
+research into APIs.
+
+Below is a video which gives some insights into the use of this development
+methodology, as well as giving a walk-through of the nmigen conversion
+process of Jon Dawson's excellent Verilog IEEE754 FPU.
+
+
+
+# Conversion of Jon Dawson's IEEE754 FPU to nmigen
+
+The [initial conversion process](https://git.libre-riscv.org/?p=ieee754fpu.git;a=commit;h=d26d9dd46e9fd22a1f89357a6fbcecf0eb723f44)
+has been extremely rapid. With Aleksander's help, the 32-bit adder was
+working within around 2-3 days. Div quickly followed; next was conversion
+to 64-bit. Multiply was then added, and Jon Dawson's unit tests were adapted
+and run: tens of thousands of unit tests passed, and several errors were found.
+Interestingly, some of them were found to be from the *original* code.
+This is because John Hauser's softfloat-3 library was used, which is more
+recent than Jon Dawson's work.
+
+As this is a general-purpose library, an announcement was made
+and a discussion on
+[librecores](https://lists.librecores.org/pipermail/discussion/2019-February/000687.html)
+followed with some interesting questions, such as: what is the gate latency
+(cycle time)? This question is still being evaluated.
+
+As the GPU is going to have FP16 support added, FP16 was added to the IEEE754
+unit with only a few actual lines of code, specifying the size of the
+mantissa and exponent in *one base class* alone.
+When it came to adding unit tests, it was quite straightforward to adapt
+John Hauser's unit test code from FP32; however, we encountered
+[an anomaly](https://groups.google.com/a/groups.riscv.org/forum/#!topic/sw-dev/JuRuL5HEIPM).
+When using sfpy (Berkeley softfloat-3 Python bindings), adding zero or minus
+zero to a non-canonical "NaN" resulted in some very strange responses from the
+Softfloat Library. We're still tracking this down.
+
+# Converting to a pipeline
+
+This is where we are currently stumped, and lack of experience with nmigen
+is showing through. The desired outcome is to adapt the code, which is
+a state machine, so that it can be pipelined. However, as this is a
+general-purpose library, and as we would like to keep certain engines
+(particularly DIV or SQRT) as state machines, the idea is
+to create "Mixin" base classes that can make use of all the various
+stages, creating a state machine where needed or a pipeline where needed,
+without requiring maintenance of two near-identical codebases.
+
+Doing so is proving somewhat irksome, and efforts are beginning to bury
+actual hardware logic under mounds of abstraction. Each stage needs
+separate inputs and outputs, which then have to be joined together.
+Several months ago a
+[pipeline class](https://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/pipeline_example.py;h=544b745b0a5d7b710b7d9eea38397acab5f4799a;hb=d26d9dd46e9fd22a1f89357a6fbcecf0eb723f44)
+was identified, from PyRTL, and adapted to nmigen. It works by overloading
+Python getattr and setattr in a class, which then auto-creates member instance
+signals with the appropriate names, such as "variable-n-stage-1" where a
+variable "n" set in stage 1 happens to be used in stage 2. All of that
+is hidden from the developer, leaving some extremely clear and obvious
+code.
+
+The problem is that there doesn't exist a state-based version of the
+same class, and, in addition, we are using classes *containing* Signals,
+and the way to adapt this code is not clear.
+
+In essence we are running into a "two unknowns" scenario. Unfamiliarity
+with nmigen *and* with how to adapt the code, keeping it running at each
+stage so that the unit tests always pass, is hampering decision-making.
+A lot more thought is required.
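
The attribute-overloading mechanism described under "Converting to a pipeline" may be easier to follow with a toy model. The sketch below is plain Python and is not the PyRTL or nmigen class linked above: `ToyPipeline`, `AddThenDouble` and `register_names` are hypothetical names invented for illustration, and it stores ordinary Python values where the real class would create hardware signals. The register naming simply mirrors the "variable-n-stage-1" example from the post and is an assumption about the scheme, not a copy of it.

```python
class ToyPipeline:
    """Toy model: a write in stage N creates a named 'register' that a
    read in stage N+1 picks up. Not the real PyRTL/nmigen class."""

    def __init__(self):
        # Names starting with "_" bypass the overloading (see __setattr__).
        self._regs = {}    # stage number -> {variable name: value}
        self._names = []   # auto-generated register names, in creation order
        self._stage = 1    # stages are numbered from 1, as in the post
        stage_methods = sorted(
            m for m in dir(type(self)) if m.startswith("stage"))
        for method_name in stage_methods:
            getattr(self, method_name)()   # run the stage body once
            self._stage += 1

    def __setattr__(self, name, value):
        if name.startswith("_"):
            # Internal bookkeeping: behave like a normal attribute.
            object.__setattr__(self, name, value)
            return
        # Record a register named after the variable and the writing stage,
        # and make the value visible to the *next* stage only.
        self._names.append(f"variable-{name}-stage-{self._stage}")
        self._regs.setdefault(self._stage + 1, {})[name] = value

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for pipeline variables.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._regs[self._stage][name]
        except KeyError:
            raise AttributeError(
                f"{name!r} was not set by the previous stage") from None

    def register_names(self):
        return list(self._names)


class AddThenDouble(ToyPipeline):
    def __init__(self, a, b):
        self._a, self._b = a, b   # leading "_": plain attributes (inputs)
        super().__init__()

    def stage1_add(self):
        self.n = self._a + self._b   # creates "variable-n-stage-1"

    def stage2_double(self):
        self.result = self.n * 2     # reads what stage 1 registered


pipe = AddThenDouble(3, 4)
print(pipe.result)            # 14
print(pipe.register_names())  # ['variable-n-stage-1', 'variable-result-stage-2']
```

The stage methods read like straight-line code, and the per-stage bookkeeping lives entirely in the base class, which is the property the post credits for the "extremely clear and obvious" result. The sketch says nothing about how a state-machine variant of the same class would look, which is the open question above.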