X-Git-Url: https://git.libre-soc.org/?p=libreriscv.git;a=blobdiff_plain;f=isa_conflict_resolution.mdwn;h=8125aa392c63c24b80d0768376037d794ade3966;hp=1002440fd74dfd770db09976dec8596dbd560396;hb=HEAD;hpb=d4d882496dfba26f945e7dabc70de693807a0e7e

diff --git a/isa_conflict_resolution.mdwn b/isa_conflict_resolution.mdwn
deleted file mode 100644
index 1002440fd..000000000
--- a/isa_conflict_resolution.mdwn
+++ /dev/null
@@ -1,355 +0,0 @@
-# Resolving ISA conflicts and providing a pain-free RISC-V Standards Upgrade Path
-
-In a lengthy thread that ironically was full of conflict indicative
-of the future direction in which RISC-V will go if left unresolved,
-multiple Custom Extensions were noted to be permitted free rein to
-introduce global binary-encoding conflict with no means of resolution
-described or endorsed by the RISC-V Standard: a practice that has known
-disastrous and irreversible consequences for any architecture that
-permits such practices (1).
-
-Much later on in the discussion it was realised that there is also no way
-within the current RISC-V Specification to transition to improved versions
-of the standard, regardless of whether the fixes are absolutely critical
-show-stoppers or whether they are just keeping the standard up-to-date (2).
-
-With no transition path there is guaranteed to be tension and conflict
-within the RISC-V Community over whether revisions should be made:
-should existing legacy designs be prioritised, mutually-exclusively over
-future designs (and what happens during the transition period is absolute
-chaos, with the compiler toolchain, software ecosystem and ultimately
-the end-users bearing the full brunt of the impact). If several
-overlapping revisions are required that have not yet transitioned out
-of use (which could take well over two decades to occur) the situation
-becomes disastrous for the credibility of the entire RISC-V ecosystem.
-
-It was also pointed out that Compliance is an extremely important factor
-to take into consideration, and that Custom Extensions (as being optional)
-effectively and quite reasonably fall entirely outside of the scope of
-Compliance Testing. At this point in the discussion however it was not
-yet noted the stark problem that the *mandatory* RISC-V Specification
-also faces, by virtue of there being no transitional way to bring in
-show-stopping critical alterations.
-
-To put this into perspective, just taking into account hardware costs
-alone: with production mask charges for 28nm being around USD $1.5m,
-engineering development costs and licensing of RTLs for peripherals
-being of a similar magnitude, no manufacturer is going to back away
-from selling a "flawed" or "legacy" product (whether it complies with
-the RISC-V Specification or not) without a bitter fight.
-
-It was also pointed out that there will be significant software tool
-maintenance costs for manufacturers, meaning that the probability will
-be extremely high that they will refuse to shoulder such costs, and
-will publish and continue to publish (and use) hopelessly out-of-date
-unpatched tools. This practice is well-known to result in security
-flaws going unpatched, with one of many immediate undesirable consequences
-being that product in extremely large volume gets discarded into landfill.
-
-**All and any of the issues that were discussed, and all of those that
-were not, can be avoided by providing a hardware-level runtime-enabled
-forwards and backwards compatible transition path between *all* parts
-(mandatory or not) of current and future revisions of the RISC-V ISA
-Standard.**
-
-The rest of the discussion - indicative as it was of the stark mutually
-exclusive gap being faced by the RISC-V ISA Standard given that it does
-not cope with the problem - was an effort by two groups in two clear
-camps: one that wanted things to remain as they are, and another that
-made efforts to point out that the consequences of not taking action
-are clearly extreme and irreversible (which, unfortunately, given the
-severity, some of the first group were unable to believe, despite there
-being clear historical precedent for the exact same mistake being made in
-other architectures, and the consequences on the same being absolutely
-clear).
-
-However after a significant amount of time, certain clear requirements came
-out of the discussion:
-
-* Any proposal must be a minimal change with minimal (or zero) impact
-* Any proposal should place no restriction on existing or future
-  ISA encoding space
-* Any proposal should take into account that there are existing implementors
-  of the (yet to be finalised but still "partly frozen") Standard who may
-  resist, for financial investment reasons, efforts to make any change
-  (at all) that could cost them immediate short-term profits.
-
-Several proposals were put forward (and some are still under discussion)
-
-* "Do nothing": problem is not severe: no action needed.
-* "Do nothing": problem is out-of-scope for RISC-V Foundation.
-* "Do nothing": problem complicates Compliance Testing (and is out of scope)
-* "MISA": the MISA CSR enables and disables extensions already: use that
-* "MISA-like": a new CSR which switches in and out new encodings
-  (without destroying state)
-* "mvendorid/marchid WARL": switching the entire "identity" of a machine
-* "ioctl-like": an OO proposal based around the Linux kernel "ioctl" system.
-
-Each of these will be discussed below in their own sections.
-
-# Do nothing (no problem exists)
-
-TBD (basically not an option).
-
-There were several solutions offered that fell into this category.
-A few of them are listed in the introduction; more are listed below,
-and it was exhaustively (and exhaustingly) established that none of
-them are workable.
-
-Initially it was pointed out that Fabless Semiconductor companies could
-simply license multiple Custom Extensions and a suitable RISC-V core, and
-modify them accordingly. The Fabless Semi Company would be responsible
-for paying the NREs on re-developing the test vectors (as the extension
-licensers would be extremely unlikely to do that without payment), and
-given that said Companies have an "integration" job to do, it would
-be reasonable to expect them to have such additional costs as well.
-
-The costs of this approach were outlined and discussed as being
-disproportionate and extreme compared to the actual likely cost of
-licensing the Custom Extensions in the first place. Additionally it
-was pointed out that not only hardware NREs would be involved but
-custom software tools (compilers and more) would also be required
-(and maintained separately, on the basis that upstream would not
-accept them except under extreme pressure, and then only with
-prejudice).
-
-All similar schemes involving customisation of the custom extensions
-were likewise rejected, but not before the customisation process was
-mistakenly conflated with the *normal* integration process of developing
-a custom processor (Bus Architectures, Cache layouts, peripheral layouts).
-
-The most compelling hardware-related reason (excluding the severe impact on
-the software ecosystem) for rejecting the customisation-of-customisation
-approach was the case where Extensions were using an instruction encoding
-space (48-bit, 64-bit) *greater* than that which the chosen core could
-cope with (32-bit, 48-bit).
-
-Overall, none of the options presented were feasible, and, in addition,
-with no clear leadership from the RISC-V Foundation on how to avoid
-global world-wide encoding conflict, even if they were followed through,
-still would result in the failure of the RISC-V ecosystem due to
-irreversible global conflicting ISA binary-encoding meanings (POWERPC's
-Altivec / SPE nightmare).
-
-This in addition to the case where the RISC-V Foundation wishes to
-fix a critical show-stopping update to the Standard, post-release,
-where billions of dollars have been spent on deploying RISC-V in the
-field.
-
-# Do nothing (out of scope)
-
-TBD (basically, may not be RV Foundation's "scope", still results in
-problem, so not an option)
-
-This was one of the first arguments presented: The RISC-V Foundation
-considers Custom Extensions to be "out of scope"; that "it's not their
-problem, therefore there isn't a problem".
-
-The logical errors in this argument were quickly enumerated: namely that
-the RISC-V Foundation is not in control of the uses to which RISC-V is
-put, such that public global conflicts in binary-encoding are a hundred
-percent guaranteed to occur, and a hundred percent guaranteed to occur in
-*commodity* hardware where Debian, Fedora, SUSE and other distros will
-be hardest hit by the resultant chaos, and that will just be the more
-"visible" aspect of the underlying problem.
-
-# Do nothing (Compliance too complex, therefore out of scope)
-
-TBD (basically, may not be RV Foundation's "scope", still results in
-problem, so not an option)
-
-The summary here was that Compliance testing of Custom Extensions is
-not just out-of-scope, but even if it was taken into account that
-binary-encoding meanings could change, it would still be out-of-scope.
-
-However at the time that this argument was made, it had not yet been
-appreciated fully the impact that revisions to the Standard would have,
-when billions of dollars worth of (older, legacy) RISC-V hardware had
-already been deployed.
-
-Two interestingly diametrically-opposed equally valid arguments exist here:
-
-* Whilst Compliance testing of Custom Extensions is definitely legitimately
-  out of scope, Compliance testing of simultaneous legacy (old revisions of
-  ISA Standards) and current (new revisions of ISA Standard) definitely
-  is not. Efforts to reduce *Compliance Testing* complexity are therefore
-  "Compliance Tail Wagging Standard Dog".
-* Beyond a certain threshold, complexity of Compliance Testing is so
-  burdensome that it risks outright rejection of the entire Standard.
-
-Meeting these two diametrically-opposed perspectives requires that the
-solution be very, very simple.
-
-# MISA
-
-TBD, basically MISA not suitable
-
-MISA permits extensions to be disabled by masking out the relevant bit.
-Hypothetically it could be used to disable one extension, then enable
-another that happens to use the same binary encoding.
-
-*However*:
-
-* MISA Extension disabling is permitted (optionally) to **destroy**
-  the state information. Thus it is totally unsuitable for cases
-  where instructions from different Custom extensions are needed in
-  quick succession.
-* MISA was only designed to cover Standard Extensions.
-* There is nothing to prevent multiple Extensions being enabled
-  that wish to simultaneously interpret the same binary encoding.
-
-Overall, whilst the MISA concept is a step in the right direction it's
-a hundred percent unsuitable for solving the problem.
-
-# MISA-like
-
-TBD, basically same as mvend/march WARL except needs an extra CSR where
-mv/ma doesn't.
-
-Out of the MISA discussion came a "MISA-like" proposal, which would
-take into account the flaws pointed out by trying to use "MISA":
-
-* The MISA-like CSR's meaning would be identified by compilers using the
-  mvendor-id/march-id tuple as a compiler target
-* Each custom-defined bit of the MISA-like CSR would (mutually-exclusively)
-  redirect binary encoding(s) to specific encodings
-* No Extension would *actually* be disabled: its internal state would
-  be left on (permanently) so that switching could be done inside
-  inner loops.
-
-Whilst it was the first "workable" solution it was also noted that the
-scheme is quite invasive: it requires an entirely new CSR to be added
-to the privileged spec. This does not completely fulfil the "minimum
-impact" requirement.
-
-Also of interest: around the same time an additional discussion was
-raised that covered the *compiler* side of the same equation.
-This revolved around using mvendorid-marchid tuples at the compiler level,
-to be put into assembly output (by gcc), preserving the required
-*globally* unique identifying information for binutils to successfully
-turn the custom instruction into an actual binary-encoding (plus
-binary-encoding of the context-switching information). (**TBD, Jacob,
-separate page? review this para?**)
-
-# mvendorid/marchid WARL
-
-TBD paraphrase and clarify
-
-Coming out of the software-related proposal by Jacob, which hinged on
-the idea of a global gcc / binutils database that kept and coordinated
-architectural encodings, was a suggestion to quite simply make the
-mvendorid and marchid CSRs have WARL (writeable) characteristics. For
-instances where mvendorid and marchid are read-only, that would be taken
-to be a Standards-mandatory "declaration" that the architecture has *no*
-Custom Extensions.
-
-This incredibly simple non-invasive idea has some unique and distinct
-advantages over other proposals:
-
-* Existing designs - even though the specification is not finalised
-  (but has "frozen" aspects) - would be completely unaffected: the
-  change is to the "wording" of the specification to "retrospectively"
-  fit reality.
-
-> In an earlier part of the thread someone kindly pointed out that MISA
-> already switches out entire sets of instructions [which interacts at the
-> "decode" phase]. However it was noted after a few days of investigating
-> that particular lead that:
->
-> * MISA Extension disabling is permitted (optionally) to DESTROY the state
-> information (which means that it *has* to be re-initialised just to be
-> safe... mistake in the standard, there), and * MISA was only designed
-> to cover Standard Extensions.
->
-> So the practice of switching extensions in and out - and the resultant
-> "disablement" and "enablement" at the *instruction decode phase* is
-> *already* a hard requirement as part of conforming with the present
-> RISC-V Specification.
->
-> Around the same MISA discussion, someone else also kindly pointed out
-> that one solution to the heavyweight nature of the switching would
-> be to deliberately introduce a pipeline stall whilst the switching is
-> occurring: I can see the sense in that approach, even if I don't know the
-> full details of what each implementor might choose to do. They may even
-> choose two, or three, or N pipeline stalls: it really doesn't matter,
-> as it's an implementors' choice (and problem to solve).
->
-> So yes it's pretty heavy-duty... and also already required.
->
-> For the case where "legacy" variants of the RISC-V Standard are
-> backwards-forwards-compatibly supported over a 10-20 year period
-> in Industrial and Military/Government-procurement scenarios (so that
-> the impossible-to-achieve pressure is off to get the spec ABSOLUTELY
-> correct, RIGHT now), nobody would expect a seriously heavy-duty amount
-> of instruction-by-instruction switching: it'd be used pretty much once
-> and only once at boot-up (or once in a Hypervisor Virtual Machine client)
-> and that's it.
->
-> I can however foresee instances where implementors would actually
-> genuinely want a bank of operations to be carried out using one extension,
-> followed immediately by another bank from a (conflicting binary-encoding)
-> extension, in an inner loop: Software-defined MPEG / MP4 decode to call
-> DCT block decode Custom Extension followed immediately by Custom Video
-> Processing Extension followed immediately by Custom DSP Processing
-> Extension to do YUV-to-RGB conversion for example is something that
-> is clearly desirable. Solving that one would be entiiirely their
-> problem... and the RISC-V Specification really really should give them
-> the space to do that in a clear-cut unambiguous way.
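
The switching behaviour discussed in this section can be illustrated with a small toy model. This is only a sketch under stated assumptions, not anything defined by the RISC-V Specification: the `Hart` class, the vendor/architecture ID values, the instruction names and the `DECODE_TABLES` layout are all hypothetical. It shows three properties the text asks for: a WARL-style write that only ever leaves a legal identity in place, one binary encoding that decodes differently depending on the active identity, and per-extension state that survives a switch (unlike MISA disabling, which is permitted to destroy it).

```python
from dataclasses import dataclass, field

# Hypothetical (mvendorid, marchid) identities; the values are made up.
BASE = (0x0, 0x0)        # no Custom Extensions active
VENDOR_A = (0x1A, 0x1)   # imaginary vendor A
VENDOR_B = (0x2B, 0x1)   # imaginary vendor B

# Illustrative opcode that both imaginary vendors chose to use.
CUSTOM_OPCODE = 0x0B

# Same binary encoding, different meaning depending on the active identity.
DECODE_TABLES = {
    BASE: {},
    VENDOR_A: {CUSTOM_OPCODE: "dct_block"},
    VENDOR_B: {CUSTOM_OPCODE: "yuv_to_rgb"},
}


@dataclass
class Hart:
    identity: tuple = BASE
    # Per-extension state, keyed by identity; never cleared on a switch.
    ext_state: dict = field(
        default_factory=lambda: {ident: 0 for ident in DECODE_TABLES}
    )

    def write_identity(self, value):
        """WARL-style write: any value may be written, only legal ones stick."""
        if value in DECODE_TABLES:
            self.identity = value
        return self.identity  # a read-back always yields a legal value

    def execute(self, opcode):
        """Decode one opcode under whichever identity is currently active."""
        name = DECODE_TABLES[self.identity].get(opcode)
        if name is None:
            return "illegal-instruction"
        self.ext_state[self.identity] += 1  # stand-in for real extension state
        return name


hart = Hart()
hart.write_identity(VENDOR_A)
first = hart.execute(CUSTOM_OPCODE)   # decoded with vendor A's meaning
hart.write_identity(VENDOR_B)
second = hart.execute(CUSTOM_OPCODE)  # same encoding, vendor B's meaning
```

Because nothing is torn down when the identity changes, the inner-loop case described in the quoted text (one bank of custom operations followed immediately by a conflicting one) costs only the identity write, plus whatever pipeline stall an implementor chooses to insert.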
-
-# ioctl-like
-
-TBD - [[ioctl]] for full details, summary kept here
-
-# Discussion and analysis
-
-TBD
-
-# Conclusion
-
-TBD
-
-# Conversation Excerpts
-
-The following conversation excerpts are taken from the ISA-dev discussion
-
-## (1) Albert Calahan on SPE / AltiVec conflict in POWERPC
-
-> Yes. Well, it should be blocked via legal means. Incompatibility is
-> a disaster for an architecture.
->
-> The viability of PowerPC was badly damaged when SPE was
-> introduced. This was a vector instruction set that was incompatible
-> with the AltiVec instruction set. Software vendors had to choose,
-> and typically the choice was "neither". Nobody wants to put in the
-> effort when there is uncertainty and a market fragmented into
-> small bits.
-> Note how Intel did not screw up. When SSE was added, MMX remained.
-> Software vendors could trust that instructions would be supported.
-> Both MMX and SSE remain today, in all shipping processors. With very
-> few exceptions, Intel does not ship chips with missing functionality.
-> There is a unified software ecosystem.
->
-> This goes beyond the instruction set. MMU functionality also matters.
-> You can add stuff, but then it must be implemented in every future CPU.
-> You can not take stuff away without harming the architecture.
-
-## (2) Luke Kenneth Casson Leighton on Standards backwards-compatibility
-
-> For the case where "legacy" variants of the RISC-V Standard are
-> backwards-forwards-compatibly supported over a 10-20 year period in
-> Industrial and Military/Government-procurement scenarios (so that the
-> impossible-to-achieve pressure is off to get the spec ABSOLUTELY
-> correct, RIGHT now), nobody would expect a seriously heavy-duty amount
-> of instruction-by-instruction switching: it'd be used pretty much once
-> and only once at boot-up (or once in a Hypervisor Virtual Machine
-> client) and that's it.
-
-## (3) Allen Baum on Standards Compliance
-
-> Putting my compliance chair hat on: One point that was made quite
-> clear to me is that compliance will only test that an implementation
-> correctly implements the portions of the spec that are mandatory, and
-> the portions of the spec that are optional and the implementor claims
-> it is implementing. It will test nothing in the custom extension space,
-> and doesn't monitor or care what is in that space.
-