add slids

diff --git a/isa_conflict_resolution.mdwn b/isa_conflict_resolution.mdwn

index 5538a7add9af251063c15808234636ec3a0cc313..8125aa392c63c24b80d0768376037d794ade3966 100644
--- a/isa_conflict_resolution.mdwn
+++ b/isa_conflict_resolution.mdwn
@@ -1,5 +1,8 @@
  # Resolving ISA conflicts and providing a pain-free RISC-V Standards Upgrade Path
  
+**Note: out-of-date as of review 31apr2018, requires updating to reflect
+"mvendorid-marchid-isamux" concept.**
+
  ## Executive Summary
  
  A non-invasive backwards-compatible change to make mvendorid and marchid
@@ -272,15 +275,22 @@ separate page?  review this para?**)
  (Summary: the only idea that meets the full requirements.  Needs
   toolchain backup, but only when the first chip is released)
  
+This proposal has full details at the following page:
+[[mvendor_march_warl]]
+
  Coming out of the software-related proposal by Jacob Bachmeyer, which
  hinged on the idea of a globally-maintained gcc / binutils database
  that kept and coordinated architectural encodings (curated by the Free
  Software Foundation), was to quite simply make the mvendorid and marchid
-CSRs have WARL (writeable) characteristics.  For instances where mvendorid
-and marchid are readable, that would be taken to be a Standards-mandatory
-"declaration" that the architecture has *no* Custom Extensions (and that
-it conforms precisely to one and only one specific variant of the
-RISC-V Specification).
+CSRs have WARL (writeable) characteristics.  Read-only is taken to
+mean a declaration of "Having no Custom Extensions" (a zero-impact
+change).
+
+By making mvendorid-marchid tuples WARL the instruction decode phase
+may re-route mutually-exclusively to different engines, thus providing
+a controlled means and method of supporting multiple (future, past and
+present) versions of the **Base** ISA, Custom Extensions and even
+completely foreign ISAs in the same processor.
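As a rough illustration of the re-routing described above, the following C sketch models a decode stage that dispatches the same instruction bits to one of several mutually-exclusive engines, selected by the current mvendorid-marchid tuple. The tuple values, the engine names and the choice to ignore illegal writes (keeping the previous legal value) are assumptions made for the example; none of them come from the proposal itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuple; real mvendorid/marchid assignments will differ. */
typedef struct {
    uint32_t mvendorid;
    uint32_t marchid;
} isa_tuple;

/* Stand-in identifiers for the engine that ends up decoding an instruction. */
enum { ENGINE_BASE_V1 = 1, ENGINE_CUSTOM = 2 };

typedef int (*decode_fn)(uint32_t insn);

/* Each engine gives its own meaning to the same binary encoding. */
static int decode_base_v1(uint32_t insn) { (void)insn; return ENGINE_BASE_V1; }
static int decode_custom(uint32_t insn)  { (void)insn; return ENGINE_CUSTOM; }

/* The set of legal tuples, each bound to exactly one decode engine. */
static const struct {
    isa_tuple tuple;
    decode_fn decode;
} engines[] = {
    { { 0x0, 0x1 }, decode_base_v1 },
    { { 0x7, 0x2 }, decode_custom  },
};
#define NUM_ENGINES (sizeof engines / sizeof engines[0])

/* Current tuple; starts at the first legal value. */
static isa_tuple current = { 0x0, 0x1 };

static int tuple_eq(isa_tuple a, isa_tuple b)
{
    return a.mvendorid == b.mvendorid && a.marchid == b.marchid;
}

/* WARL write: any value may be written, only legal values are ever read
 * back. Assumption: an illegal write leaves the previous value in place. */
static void write_tuple(uint32_t mvendorid, uint32_t marchid)
{
    isa_tuple t = { mvendorid, marchid };
    for (size_t i = 0; i < NUM_ENGINES; i++) {
        if (tuple_eq(engines[i].tuple, t)) {
            current = t;
            return;
        }
    }
}

/* Decode: exactly one engine is enabled for the current tuple. */
static int decode(uint32_t insn)
{
    for (size_t i = 0; i < NUM_ENGINES; i++) {
        if (tuple_eq(engines[i].tuple, current))
            return engines[i].decode(insn);
    }
    assert(0 && "WARL invariant violated: current tuple is not legal");
    return 0;
}
```

Calling `write_tuple(0x7, 0x2)` and then `decode()` on the same instruction word selects the custom engine instead of the base one, while a write of an unsupported tuple changes nothing.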
  
  This incredibly simple non-invasive idea has some unique and distinct
  advantages over other proposals:
@@ -302,101 +312,117 @@ advantages over other proposals:
    an inner loop, with a single instruction (to the WARL register)
    changing the meaning.
  
-A couple of points were made:
-
-* Compliance Testing may **fail** any system that has mvendorid/marchid
-  as WARL.  This however is a clear case of "Compliance Tail Wagging Standard
-  Dog".
-* The redirection of meaning of certain binary encodings to multiple
-  engines was considered extreme, eyebrow-raising, and also (importantly)
-  potentially expensive, introducing significant latency at the decode
-  phase.
-
-On this latter point, it was observed that MISA already switches out entire
-sets of instructions (interacts at the "decode" phase).  The difference
-between what MISA does and the mvendor/march-id WARL idea is that whilst
-MISA only switches instruction decoding on (or off), the WARL idea
-*redirects* encoding, effectively to *different* simultaneous engines,
-fortunately in a deliberately mutually-exclusive fashion.
-
-Implementations would therefore, in each Extension (assuming one separate
-"decode" engine per Extension), simply have an extra (mutually-exclusively
-enabled) wire in the AND gate for any given binary encoding, and in this
-way there would actually be very little impact on the latency.  The assumption
-here is that there are not dozens of Extensions vying for the same binary
-encoding (at which point the Fabless Semi Company has other much more
-pressing issues to deal with that make resolving binary encoding conflicts
-trivial by comparison).
-
-Also pointed out was that in certain cases pipeline stalls could be introduced
-during the switching phase, if needed, just as they may be needed for
-correct implementation of (mandatory) support for MISA.
-
  **This is the only one of the proposals that meet the full requirements**
  
-Update 29apr2018:
-
-* In cases where mvendorid and marchid are WARL, the mvendorid-marchid becomes
-  part of the execution context that must be saved (and switched as necessary)
-  just like any other state / CSR.
-* When any trap exception is raised the context / state *must not* be altered
-  (so that it can be properly saved, if needed, by the exception handler)
-  and that includes the current mvendorid-marchid tuple.  This leads to some
-  interesting situations where a hart could conceivably be directed
-  to a set of trap handler binary instructions that the current
-  mvendorid-marchid setting is incapable of correctly interpreting.
-  To fix this it will be necessary for implementations (hardware /
-  software) to set up separate per-mvendorid-marchid trap handlers and
-  for the hardware (or software) to switch to the appropriate trap "set"
-  when the mvendorid-marchid is written to.  The switch to a different
-  "set" will almost undoubtedly require (transparent) hardware assistance.
-* It's been noted that there may be certain legitimate cases where an
-  mvendorid-marchid should *specifically* not be tested for RISC-V
-  Certification Compliance: native support for foreign architectures
-  (not related to the JIT Extension: *actual* full entire non-RISC-V
-  foreign instruction encoding).  Exactly how this would work (vis-a-vis
-  Compliance) needs discussion, as it would be unfortunate and
-  undesirable for a hybrid processor capable of executing more than one
-  hardware-level ISA support to not be permitted to receive RISC-V
-  Certification Compliance.
-
-# ioctl-like <a name="ioctl-like"></a>
-
-(Summary: good solid orthogonal idea.  See [[ioctl]] for full details)
+# Overloadable opcodes <a name="overloadable opcodes"></a>
+
+See [[overloadable opcodes]] for full details, including a description in terms of C functions.
  
  NOTE: under discussion.
  
-This proposal basically mirrors the concept of POSIX ioctls, providing
-(arbitrarily) 8 functions (opcodes) whose meaning may be over-ridden
-in an object-orientated fashion by calling an "open handle" (and close)
-function (instruction) that switches (redirects) the 8 functions over to
-different opcodes.
+==RB 2018-5-1 dropped IOCTL proposal for the much simpler overloadable opcodes proposal== 
  
-The proposal is functionally near-identical to that of the mvendor/march-id
-except extended down to individual opcodes.  As such it could hypothetically
-be proposed as an independent Standard Extension in its own right that extends
-the Custom Opcode space *or* fits into the brownfield spaces within the
-existing ISA opcode space *or* is used as the basis of an independent
-Custom Extension in its own right.
+The overloadable opcode (or xext) proposal allows a non-standard extension to use a documented 20 + 3 bit (or 52 + 3 bit on RV64) UUID for an instruction, for _software_ to use. At runtime, the cpu translates the UUID to a small implementation-defined 12 + 3 bit identifier for _hardware_ to use. It also defines a fallback mechanism for the UUIDs of instructions the cpu does not recognise.
  
-==RB==
-I really think it should be in browncode
-==RB==
+The overloadable opcodes proposal defines 8 standardised R-type instructions xcmd0, xcmd1, ... xcmd7, preferably in the brownfield opcode space.
+Each xcmd takes in rs1 a 12 bit "logical unit" (lun) identifying a device on the cpu that implements some "extension interface" (xintf), together with some additional data. An xintf is a set of up to 8 commands with 2 input ports and 1 output port (i.e. like an R-type instruction), together with a description of the semantics of the commands. Calling e.g. xcmd3 routes its two input ports and one output port to command 3 on the device determined by the lun bits in rs1. Thus, the 8 standard xcmd instructions are standard-designated overloadable opcodes, with the non-standard semantics of each opcode determined by the lun.
  
-One of the reasons for seeking an extension of the Custom opcode space is
-that the Custom opcode space is severely limited: only 2 opcodes are free
-within the 32-bit space, and only four total remain in the 48 and 64-bit
-space.
+Portable software does not use luns directly. Instead, it goes through a level of indirection using a further instruction, xext, that translates the 20 bit globally unique identifier (UUID) of an xintf to the lun of a device on the cpu that implements that xintf. The cpu can do this because it knows (at manufacturing or boot time) which devices it has, and which xintfs they provide. This includes devices that would be described as non-standard extensions of the cpu if the designers had used custom opcodes instead of xintf as an interface. If the UUID of the xintf is not recognised at the current privilege level, the xext instruction returns the special lun = 0, causing any xcmd to trap. Minor variations of this scheme (requiring two more instructions) cause xcmd instructions to fall back to always returning 0 or -1 instead of trapping.
+
+The 20 bits provided by the UUID of the xintf give much more room than the 2 custom 32 bit, or even the 4 custom 64/48 bit, opcode spaces. Thus the overloadable opcodes proposal avoids most of the need to put a claim on opcode space, and the associated collisions when combining independent extensions. In this respect it is similar to POSIX ioctls, which obviate the need for defining new syscalls to control new and non-standard hardware.
+
+Remark1: the main difference from the previous "ioctl-like" proposal is that UUID translation is stateless and does not use resources. The xext instruction _neither_ initialises a device _nor_ builds global state identified by a cookie. If a device needs initialisation it can do this using xcmds as init and deinit instructions. Likewise, it can hand out cookies (which can include the lun) as a return value.
+
+Remark2: Implementing devices can respond to an (essentially) arbitrary number of xintfs. Hence an implementing device can respond to an arbitrary number of commands. Organising related commands in xintfs helps avoid UUID space pollution, and allows the (small) cost of UUID to lun translation to be amortised if related commands are used in combination.
+
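The xext / xcmd mechanism described above can be sketched as a pair of C functions. This is only a hedged model of the idea, not the proposal's actual definition: the device table, the UUID value `0x12345`, the placement of the lun in the low 12 bits of rs1, the `trapped` flag standing in for a hardware trap, and the decision to trap on an unimplemented command are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUN_NONE 0u  /* special lun returned by xext for an unknown UUID */
#define NUM_CMDS 8   /* xcmd0 .. xcmd7 */

/* One command of an xintf: two inputs, one output, like an R-type insn. */
typedef uint32_t (*xcmd_fn)(uint32_t data, uint32_t rs2);

typedef struct {
    uint32_t uuid;           /* 20 bit xintf identifier (hypothetical value) */
    uint16_t lun;            /* 12 bit logical unit, never LUN_NONE */
    xcmd_fn  cmds[NUM_CMDS]; /* NULL marks an unimplemented command */
} device;

/* Hypothetical device implementing two commands of one xintf. */
static uint32_t dev_add(uint32_t data, uint32_t rs2) { return data + rs2; }
static uint32_t dev_xor(uint32_t data, uint32_t rs2) { return data ^ rs2; }

/* Known to the cpu at manufacturing or boot time. */
static const device devices[] = {
    { 0x12345u, 1u, { dev_add, dev_xor } },
};
#define NUM_DEVICES (sizeof devices / sizeof devices[0])

/* xext: stateless translation from xintf UUID to lun. */
static uint32_t xext(uint32_t uuid)
{
    for (size_t i = 0; i < NUM_DEVICES; i++) {
        if (devices[i].uuid == uuid)
            return devices[i].lun;
    }
    return LUN_NONE;
}

/* xcmd<n>: route the operands to command n of the device selected by the
 * lun. Assumption: the lun sits in the low 12 bits of rs1 and the rest of
 * rs1 is the "additional data". */
static uint32_t xcmd(unsigned n, uint32_t rs1, uint32_t rs2, int *trapped)
{
    assert(n < NUM_CMDS);
    uint16_t lun = (uint16_t)(rs1 & 0xFFFu);
    *trapped = 0;
    if (lun != LUN_NONE) {
        for (size_t i = 0; i < NUM_DEVICES; i++) {
            if (devices[i].lun == lun && devices[i].cmds[n] != NULL)
                return devices[i].cmds[n](rs1 >> 12, rs2);
        }
    }
    *trapped = 1; /* lun 0, unknown lun, or unimplemented command */
    return 0;
}
```

Portable code would call `xext()` once with the documented UUID, keep the returned lun, and then combine it with its data in rs1 for each `xcmd`. Because `xext()` only reads a table, nothing needs to be saved or restored across context switches.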
+==RB not sure if this is still correct and relevant==
+
+The proposal is functionally similar to that of the mvendor/march-id,
+except that the non-standard extension is explicit and restricted to a small set of well-defined individual opcodes.
+Hence several extensions can be mixed, and there is no state to be tracked over context switches.
+As such it could hypothetically be proposed as an independent Standard Extension.
  
  Despite the proposal (which is still undergoing clarification)
  being worthwhile in its own right, and standing on its own merits and
-thus definitely worthwhile pursuing, it is non-trivial and much more
+thus definitely worthwhile pursuing, it is non-trivial and more
  invasive than the mvendor/march-id WARL concept.
  
+==RB==
+
  # Comments, Discussion and analysis
  
  TBD: placeholder as of 26apr2018
  
+## new (old) m-a-i tuple idea
+
+> actually that's a good point: where the user decides that they want
+> to boot one and only one tuple (for the entire OS), forcing a HARDWARE
+> level default m-a-i tuple at them actually prevents and prohibits them
+> from doing that, Jacob.
+> 
+> so we have apps on one RV-Base ISA and apps on an INCOMPATIBLE (future)
+> variant of RV-Base ISA.  with the approach that i was advocating (S-mode
+> does NOT switch automatically), there are totally separate mtvec /
+> stvec / bstvec traps.
+> 
+> would it be reasonable to assume the following:
+> 
+> (a) RV-Base ISA, particularly code-execution in the critical S-mode
+> trap-handling, is *EXTREMELY* unlikely to ever be changed, even thinking
+> 30 years into the future ?
+> 
+> (b) if the current M-mode (user app level) context is "RV Base ISA 1"
+> then i would hazard a guess that S-mode is pretty much going to drop
+> down into *exactly* the same mode / context, the majority of the time
+> 
+> thus the hypothesis is that not only is it the common code-path to *not*
+> switch the ISA in the S-mode trap but that the instructions used are
+> extremely unlikely to be changed between "RV Base Revisions".
+> 
+> foreign isa hardware-level execution
+> ------------------------
+> 
+> this is the one i've not really thought through so much, other than it
+> would clearly be disadvantageous for S-mode to be arbitrarily restricted
+> to running RV-Base code (of any variant).  a case could be made that by the
+> time the m-a-i tuple is switched to the foreign isa it's "all bets off",
+> foreign arch is "on its own", including having to devise a means and
+> method to switch back (equivalent in its ISA of m-a-i switching).
+> 
+> conclusion / idea
+> --------------------
+> 
+> the multi-base "user wants to run one and only one tuple" is the key
+> case, here, that is a show-stopper to the idea of hard-wiring the default
+> S-mode m-a-i.
+> 
+> now, if instead we were to say, "ok so there should be a default S-mode
+> m-a-i tuple" and it was permitted to SET (choose) that tuple, *that*
+> would solve that problem.  it could even be set to the foreign isa. 
+> which would be hilarious.
+
+Jacob's idea: one hart, one configuration:
+
+>>>  (a) RV-Base ISA, particularly code-execution in the critical S-mode
+>>> trap-handling, is *EXTREMELY* unlikely to ever be changed, even
+>>> thinking 30 years into the future ?
+>>
+>> Oddly enough, due to the minimalism of RISC-V, I believe that this is
+>> actually quite likely.  :-)
+>>
+>>>  thus the hypothesis is that not only is it the common code-path to
+>>> *not* switch the ISA in the S-mode trap but that the instructions used
+>>> are extremely unlikely to be changed between "RV Base Revisions".
+>>>
+>> Correct.  I argue that S-mode should *not* be able to switch the selected
+>> ISA on multi-arch processors. 
+>
+> that would produce an artificial limitation which would prevent
+> and prohibit implementors from making a single-core (single-hart)
+> multi-configuration processor.
+
+
+
  # Summary and Conclusion
  
  In the early sections (those in the category "no action") it was established
@@ -525,7 +551,36 @@ The following conversation exerpts are taken from the ISA-dev discussion
  > it is implementing. It will test nothing in the custom extension space,
  > and doesn't monitor or care what is in that space.
  
+## (4) Jacob Bachmeyer on explaining disambiguation of opcode space
+
+> ...have different harts with different sets of encodings.)  Adding a "select"
+> CSR as has been proposed does not escape this fundamental truth that
+> instruction decode must be unambiguous, it merely expands every opcode with
+> extra bits from a "select" CSR.
+
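Jacob's point in excerpt (4) can be pictured with a very small C sketch: a "select" CSR does not remove the need for unambiguous decode, it simply prepends extra bits to every instruction word. The function name and the 32-bit width chosen for the select value are assumptions for illustration only.

```c
#include <stdint.h>

/* Model the decoder's view of an instruction as the select CSR value
 * concatenated with the 32-bit instruction word. Two different select
 * values therefore yield two different (still unambiguous) opcodes. */
static uint64_t effective_opcode(uint32_t select, uint32_t insn)
{
    return ((uint64_t)select << 32) | insn;
}
```

The same instruction word `0x0000000b` under select values 0 and 1 maps to two distinct 64-bit keys, which is all that the decode stage needs in order to remain unambiguous.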
+## (5) Krste Asanovic on clarification of use of opcode space
+
+> A CPU is even free to reuse some standard extension encoding space for
+> non-standard extensions provided it does not claim to implement that
+> standard extension.
+
+## (6) Clarification of difference between assembler and encodings
+
+> > The extensible assembler database I proposed assumes that each processor
+> > will have *one* and *only* one set of recognized instructions.  (The "hidden
+> > prefix" is the immutable vendor/arch/impl tuple in my proposals.) 
+>
+>  ah this is an extremely important thing to clarify, the difference
+> between the recognised instruction assembly mnemonic (which must be
+> globally world-wide accepted as canonical) and the binary-level encodings
+> of that mnemonic used by different vendor implementations which will most
+> definitely *not* be unique but require "registration" in the form of
+> atomic acceptance as a patch by the FSF to gcc and binutils [and other
+> compiler tools].
+
+
  # References
  
  * <https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/7bbwSIW5aqM>
  * <https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/InzQ1wr_3Ak%5B1-25%5D>
+* Review mvendorid-marchid WARL <https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/Uvy9paXN1xA>