# Resolving ISA conflicts and providing a pain-free RISC-V Standards Upgrade Path

In a lengthy thread that ironically was full of conflict, indicative of the direction in which RISC-V will go if the issue is left unresolved, it was noted that multiple Custom Extensions are permitted free rein to introduce global binary-encoding conflict, with no means of resolution described or endorsed by the RISC-V Standard: a practice that has known disastrous and irreversible consequences for any architecture that permits it (1).

Much later on in the discussion it was realised that there is also no way within the current RISC-V Specification to transition to improved versions of the standard, regardless of whether the fixes are absolutely critical show-stoppers or whether they are just keeping the standard up-to-date (2). With no transition path there is guaranteed to be tension and conflict within the RISC-V Community over whether revisions should be made: should existing legacy designs be prioritised, mutually exclusively, over future designs? (What happens during the transition period is absolute chaos.) If several overlapping revisions are required that have not yet transitioned out of use (which could take well over two decades to occur) the situation becomes disastrous for the credibility of RISC-V.

It was also pointed out that Compliance is an extremely important factor to take into consideration, and that Custom Extensions (being optional) effectively fall entirely outside of Compliance Testing. At this point in the discussion, however, the stark problem that the *mandatory* RISC-V Specification also faces had not yet been noted: there is no transitional way to bring in show-stopping critical alterations.

To put this into perspective, just taking into account hardware costs alone: with production mask charges for 28nm being around USD $1.5m, and engineering development costs and licensing of RTLs for peripherals being of a similar magnitude, no manufacturer is going to back away from selling a "flawed" or "legacy" product (whether it complies with the RISC-V Specification or not) without a bitter fight.

It was also pointed out that there will be significant software tool maintenance costs for manufacturers, meaning that the probability is extremely high that they will refuse to shoulder such costs and will instead publish hopelessly out-of-date unpatched tools. This practice is well-known to result in security flaws going unpatched, with one of many immediate consequences being that product gets discarded into landfill.

All of the issues that were discussed, and all of those that were not, can be avoided by providing a forwards and backwards compatible transition path between the current and future *mandatory* parts of revisions of the RISC-V ISA Standard.

The rest of the discussion - indicative as it was of the stark mutually exclusive gap being faced by the RISC-V ISA Standard given that it does not cope with the problem - took place between two clear camps: one that wanted things to remain as they are, and another that made efforts to point out that the consequences of not taking action are clearly extreme and irreversible (which, unfortunately, given the severity, some of the first group were unable to believe, despite there being clear historical precedent for the same mistake being made in other architectures).

However, after a significant amount of time, certain clear requirements came out of the discussion:

* Any proposal must be a minimal change with minimal (or zero) impact
* Any proposal should place no restriction on existing or future ISA encoding space
* Any proposal should take into account that there are existing implementors of the (yet to be finalised but still "partly frozen") Standard who may resist, for financial investment reasons, efforts to make any change (at all) that could cost them immediate short-term profits.

Several proposals were put forward (and some are still under discussion):

* "Do nothing": problem is not severe: no action needed.
* "Do nothing": problem is out-of-scope for RISC-V Foundation.
* "Do nothing": problem complicates Compliance Testing (and is out of scope)
* "MISA": the MISA CSR enables and disables extensions already: use that
* "MISA-like": a new CSR which switches in and out new encodings (without destroying state)
* "mvendorid/marchid WARL": switching the entire "identity" of a machine
* "ioctl-like": an OO proposal based around the Linux kernel "ioctl" system.

Each of these will be discussed below in its own section.

# Do nothing (no problem exists)

TBD (basically not an option).

There were several solutions offered that fell into this category. A few of them are listed in the introduction; more are listed below, and it was exhaustively (and exhaustingly) established that none of them are workable.

Initially it was pointed out that Fabless Semiconductor companies could simply license multiple Custom Extensions and a suitable RISC-V core, and modify them accordingly. The Fabless Semi Company would be responsible for paying the NREs on re-developing the test vectors (as the extension licensers would be extremely unlikely to do that without payment), and given that said Companies have an "integration" job to do, it would be reasonable to expect them to have such additional costs as well.

The costs of this approach were outlined and discussed as being disproportionate and extreme compared to the actual likely cost of licensing the Custom Extensions in the first place. Additionally it was pointed out that not only hardware NREs would be involved: custom software tools (compilers and more) would also be required (and maintained separately, on the basis that upstream would not accept them except under extreme pressure, and then only with prejudice).

All similar schemes involving customisation of the custom extensions were likewise rejected, but not before the customisation process was mistakenly conflated with the *normal* integration process of developing a custom processor (Bus Architectures, Cache layouts, peripheral layouts).

The most compelling hardware-related reason (excluding the severe impact on the software ecosystem) for rejecting the customisation-of-customisation approach was the case where Extensions were using an instruction encoding space (48-bit, 64-bit) *greater* than that which the chosen core could cope with (32-bit, 48-bit).

Overall, none of the options presented were feasible and, in addition, even if they were followed through, they would still result in the failure of the RISC-V ecosystem due to globally conflicting ISA binary-encoding meanings (PowerPC's AltiVec / SPE nightmare).

# Do nothing (out of scope)

TBD (basically, may not be RV Foundation's "scope", still results in problem, so not an option)

This was one of the first arguments presented: the RISC-V Foundation considers Custom Extensions to be "out of scope"; that "it's not their problem, therefore there isn't a problem".

The logical errors in this argument were quickly enumerated: namely that the RISC-V Foundation is not in control of the use-cases, such that binary-encoding conflict is a hundred percent guaranteed to occur, and a hundred percent guaranteed to occur in *commodity* hardware, where Debian, Fedora, SUSE and other distros will be hardest hit by the resultant chaos. That will just be the more "visible" aspect of the underlying problem.

# Do nothing (Compliance too complex, therefore out of scope)

TBD (basically, may not be RV Foundation's "scope", still results in problem, so not an option)

Two interestingly diametrically-opposed, equally valid arguments exist here:

* Whilst Compliance testing of Custom Extensions is definitely legitimately out of scope, Compliance testing of simultaneous legacy (old revisions of the ISA Standard) and current (new revisions of the ISA Standard) definitely is not. Efforts to reduce *Compliance Testing* complexity are therefore "Compliance Tail Wagging Standard Dog".
* Beyond a certain threshold, complexity of Compliance Testing is so burdensome that it risks outright rejection of the entire Standard.

Meeting these two diametrically-opposed perspectives requires that the solution be very, very simple.

# MISA

TBD, basically MISA not suitable

MISA permits extensions to be disabled by masking out the relevant bit. Hypothetically it could be used to disable one extension, then enable another that happens to use the same binary encoding.

*However*:

* MISA Extension disabling is permitted (optionally) to **destroy** the state information. Thus it is totally unsuitable for cases where instructions from different Custom extensions are needed in quick succession.
* MISA was only designed to cover Standard Extensions.
* There is nothing to prevent multiple Extensions being enabled that wish to simultaneously interpret the same binary encoding.

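
The first objection above can be illustrated with a minimal sketch in Python. It is a toy model, not a description of any real implementation: the names `misa_bit`, `ToyHart` and `write_misa` are hypothetical, and the hart is simply *assumed* to take the option of destroying an extension's state when its bit is cleared. The only detail taken from the Privileged Specification is the layout of the Extensions field, where letter "A" maps to bit 0 through to "Z" at bit 25.

```python
"""Toy model of switching extensions via MISA (illustrative only)."""


def misa_bit(letter):
    """Mask for an extension letter: 'A' is bit 0 ... 'Z' is bit 25."""
    return 1 << (ord(letter.upper()) - ord("A"))


class ToyHart:
    """Hypothetical hart, not modelled on any real core."""

    def __init__(self, misa):
        self.misa = misa
        # Per-extension architectural state, keyed by extension letter.
        self.state = {}

    def enabled(self, letter):
        return bool(self.misa & misa_bit(letter))

    def write_misa(self, value):
        # Assumption: this hart destroys the state of every extension
        # whose bit goes from 1 to 0 (an option the text above says is
        # permitted).
        for letter in list(self.state):
            mask = misa_bit(letter)
            if (self.misa & mask) and not (value & mask):
                del self.state[letter]
        self.misa = value


# Enable I and F, and give F some register contents.
hart = ToyHart(misa_bit("I") | misa_bit("F"))
hart.state["F"] = {"f0": 0x3FF00000}

# Switch F out and back in again, as a MISA-based scheme would do
# around every use of a conflicting extension.
hart.write_misa(hart.misa & ~misa_bit("F"))
hart.write_misa(hart.misa | misa_bit("F"))

# F is enabled again, but its earlier register contents are gone.
restored = hart.state.get("F")
```

In this sketch `restored` ends up as `None`: software would have to save and re-initialise the state around every switch, which is what makes the approach unattractive when two conflicting extensions are needed in quick succession.
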
Overall, whilst the MISA concept is a step in the right direction it's a hundred percent unsuitable for solving the problem.

# MISA-like

TBD, basically same as mvend/march WARL except needs an extra CSR where mv/ma doesn't.

# mvendorid/marchid WARL

TBD paraphrase and clarify

> In an earlier part of the thread someone kindly pointed out that MISA
> already switches out entire sets of instructions [which interacts at the
> "decode" phase]. However it was noted after a few days of investigating
> that particular lead that:
>
> * MISA Extension disabling is permitted (optionally) to DESTROY the state
>   information (which means that it *has* to be re-initialised just to be
>   safe... mistake in the standard, there), and
> * MISA was only designed to cover Standard Extensions.
>
> So the practice of switching extensions in and out - and the resultant
> "disablement" and "enablement" at the *instruction decode phase* is
> *already* a hard requirement as part of conforming with the present
> RISC-V Specification.
>
> Around the same MISA discussion, someone else also kindly pointed out
> that one solution to the heavyweight nature of the switching would
> be to deliberately introduce a pipeline stall whilst the switching is
> occurring: I can see the sense in that approach, even if I don't know the
> full details of what each implementor might choose to do. They may even
> choose two, or three, or N pipeline stalls: it really doesn't matter,
> as it's an implementors' choice (and problem to solve).
>
> So yes it's pretty heavy-duty... and also already required.
>
> For the case where "legacy" variants of the RISC-V Standard are
> backwards-forwards-compatibly supported over a 10-20 year period
> in Industrial and Military/Government-procurement scenarios (so that
> the impossible-to-achieve pressure is off to get the spec ABSOLUTELY
> correct, RIGHT now), nobody would expect a seriously heavy-duty amount
> of instruction-by-instruction switching: it'd be used pretty much once
> and only once at boot-up (or once in a Hypervisor Virtual Machine client)
> and that's it.
>
> I can however foresee instances where implementors would actually
> genuinely want a bank of operations to be carried out using one extension,
> followed immediately by another bank from a (conflicting binary-encoding)
> extension, in an inner loop: Software-defined MPEG / MP4 decode to call
> DCT block decode Custom Extension followed immediately by Custom Video
> Processing Extension followed immediately by Custom DSP Processing
> Extension to do YUV-to-RGB conversion for example is something that
> is clearly desirable. Solving that one would be entiiirely their
> problem... and the RISC-V Specification really really should give them
> the space to do that in a clear-cut unambiguous way.

# ioctl-like

TBD - [[ioctl]] for full details, summary kept here

# Discussion and analysis

TBD

# Conclusion

TBD

# Conversation Excerpts

The following conversation excerpts are taken from the ISA-dev discussion

## (1) Albert Calahan on SPE / AltiVec conflict in PowerPC

> Yes. Well, it should be blocked via legal means. Incompatibility is
> a disaster for an architecture.
>
> The viability of PowerPC was badly damaged when SPE was
> introduced. This was a vector instruction set that was incompatible
> with the AltiVec instruction set. Software vendors had to choose,
> and typically the choice was "neither". Nobody wants to put in the
> effort when there is uncertainty and a market fragmented into
> small bits.
> Note how Intel did not screw up.
> When SSE was added, MMX remained.
> Software vendors could trust that instructions would be supported.
> Both MMX and SSE remain today, in all shipping processors. With very
> few exceptions, Intel does not ship chips with missing functionality.
> There is a unified software ecosystem.
>
> This goes beyond the instruction set. MMU functionality also matters.
> You can add stuff, but then it must be implemented in every future CPU.
> You can not take stuff away without harming the architecture.

## (2) Luke Kenneth Casson Leighton on Standards backwards-compatibility

> For the case where "legacy" variants of the RISC-V Standard are
> backwards-forwards-compatibly supported over a 10-20 year period in
> Industrial and Military/Government-procurement scenarios (so that the
> impossible-to-achieve pressure is off to get the spec ABSOLUTELY
> correct, RIGHT now), nobody would expect a seriously heavy-duty amount
> of instruction-by-instruction switching: it'd be used pretty much once
> and only once at boot-up (or once in a Hypervisor Virtual Machine
> client) and that's it.

## (3) Allen Baum on Standards Compliance

> Putting my compliance chair hat on: One point that was made quite
> clear to me is that compliance will only test that an implementation
> correctly implements the portions of the spec that are mandatory, and
> the portions of the spec that are optional and the implementor claims
> it is implementing. It will test nothing in the custom extension space,
> and doesn't monitor or care what is in that space.