to use register 112, for example. One of those could even be changed
to 32-bit operations whilst the other is set to 16-bit element widths.
-Our initial thoughts were to try a standard simple in-order SIMD architecture,
+Our initial thoughts advocated a standard simple in-order SIMD architecture,
with predication bits passed down into the SIMD ALUs. If a bit is "off",
that "lane" within the ALU does not calculate a result, saving power.
-However, a pre-analysis engine is required that re-orders the registers,
-packs lanes of data together so that it fits into one SIMD ALU, and, on
+However, in SV, when the element width is set to 32, 16 or 8-bit, a
+pre-issue engine is required that re-orders *parts* of the registers,
+packing lanes of data together so that it fits into one SIMD ALU, and, on
exit from the ALU, it may be necessary to split and "redirect" parts of the
data to *multiple* actual 64-bit registers. In other words, bit-level
(or byte-level) manipulation is required, both pre- and post- ALU.
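
To make the pre-ALU half of that shuffling concrete, here is a minimal sketch of packing predicated sub-register elements into 64-bit SIMD words. The function names, the lowest-bits-first element ordering and the integer predicate mask are illustrative assumptions, not part of the SV specification or of any actual design; the post-ALU "split and redirect" step is not shown.

```python
# Hypothetical illustration only: names and element ordering are assumptions.

def extract_elements(regs, elwidth):
    """Split each 64-bit register into elwidth-bit elements, lowest bits first."""
    per_reg = 64 // elwidth
    emask = (1 << elwidth) - 1
    return [(r >> (i * elwidth)) & emask for r in regs for i in range(per_reg)]


def pack_lanes(elements, predicate, elwidth):
    """Pack only the predicated-on elements into 64-bit SIMD words."""
    per_word = 64 // elwidth
    # Keep element i only if bit i of the predicate mask is set.
    active = [e for i, e in enumerate(elements) if (predicate >> i) & 1]
    words = []
    for start in range(0, len(active), per_word):
        word = 0
        for lane, e in enumerate(active[start:start + per_word]):
            word |= e << (lane * elwidth)
        words.append(word)
    return words


regs = [0x0004000300020001, 0x0008000700060005]
elements = extract_elements(regs, 16)          # eight 16-bit elements
packed = pack_lanes(elements, 0b10101010, 16)  # odd-numbered elements only
```

With this input, four active elements drawn from two different registers end up in a single SIMD word, which is the kind of cross-register, sub-register reordering the paragraph above refers to.
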
elements to "lanes", and if a predication bit is not set, the lane
runs "empty". By contrast, with the multi-issue execution model, an
operation that is predicated out means that the element-based instruction
-does not even make it into the instruction queue. Thus, unlike in a
+does not even make it into the instruction queue, leaving it free for
+use by following instructions, even in the same cycle, and even if the
+operation is totally different. Thus, unlike in a
traditional vector architecture, ALUs may be occupied by elements from
-other "Lanes", because of the pre-existing decoupling between the multi-issue
-instruction queue and the ALUs.
+other "Lanes", because the pre-existing decoupling between the multi-issue
+instruction queue and the ALUs is efficiently leveraged.
Simple!
-[[reorder_buffer.jpg]]
-
There are many other benefits to a multi-issue microarchitecture, and
these are being discussed
[here](http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2018-December/000198.html)
commonly taught in Universities, and, secondly, patents on the algorithm
have long since expired.
+[[reorder_buffer.jpg]]
+
Also, there are both memory hazards and register hazards that a Reorder
Buffer augmented Tomasulo algorithm takes care of, whilst also allowing
for branch prediction and really simple roll-back, preservation of
We also may need to have simple Branch Prediction, because some of the
loops in [Kazan](https://salsa.debian.org/Kazan-team/kazan/) are particularly
tight. A Reorder Buffer can easily be used to implement Branch Prediction,
-because, just as with an Exception, the ROB needs to be cleared out
+because, just as with an Exception, the ROB can be cleared out
(flushed) if the branch is mispredicted. As it is necessary to respect
Exceptions, the logic has to exist to clear out the ROB: Branch Prediction
simply uses this pre-existing logic.
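
As a rough sketch of that sharing, the toy model below routes both an exception and a branch misprediction through one flush routine. The class and method names are hypothetical, and the model ignores details such as whether the faulting instruction itself is also discarded; it is not a description of the actual design.

```python
from collections import deque


class ReorderBuffer:
    """Toy ROB holding instruction tags in program order (illustrative only)."""

    def __init__(self):
        self.entries = deque()  # oldest instruction on the left

    def issue(self, tag):
        self.entries.append(tag)

    def flush_after(self, tag):
        """Discard every entry younger than `tag` and return what was dropped.

        If `tag` is not present, the whole buffer is emptied.
        """
        dropped = []
        while self.entries and self.entries[-1] != tag:
            dropped.append(self.entries.pop())
        return dropped

    # Both events reuse the same flush logic.
    def on_exception(self, faulting_tag):
        return self.flush_after(faulting_tag)

    def on_mispredict(self, branch_tag):
        return self.flush_after(branch_tag)


rob = ReorderBuffer()
for t in ["add", "branch", "mul", "load"]:
    rob.issue(t)
squashed = rob.on_mispredict("branch")  # speculative work after the branch
```
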
* There's no clear way to handle branch prediction, where the Reorder
Buffer of Tomasulo handles it really cleanly.
+However, there are downsides to Reorder Buffers:
+
+* The Common Data Bus may become a serious bottleneck, as it delivers
+ data from multiple ALUs which may be generating results simultaneously.
+ To keep up with result generation, *multiple* CDBs may be needed, which
+ results in each receiver having multiple ports.
+* The Destination field in the ROB has to act as a key in a CAM (Content
+ Addressable Memory). As a result, power consumption of the ROB may be
+ quite high. It may or may not be possible to reduce power consumption
+ by testing an "active" bitfield (separate from but augmenting the ROB)
+ to indicate whether Destination Registers are in use. If inactive,
+ the CAM lookup need not take place.
+
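The "active" bitfield idea in the second point can be sketched in software as follows. This is only a behavioural model under stated assumptions: the class name, the dictionary standing in for the CAM, and the search counter are all hypothetical, and whether the scheme actually saves power in hardware is exactly the open question raised above.

```python
class RobDestLookup:
    """Behavioural model of an 'active' bitfield gating a CAM search."""

    def __init__(self):
        self.active = 0        # one bit per register: set if any ROB entry targets it
        self.dest = {}         # stand-in for the CAM: ROB index -> destination register
        self.cam_searches = 0  # counts how often the expensive search runs

    def allocate(self, rob_index, dest_reg):
        self.dest[rob_index] = dest_reg
        self.active |= 1 << dest_reg

    def retire(self, rob_index):
        reg = self.dest.pop(rob_index)
        # Clear the bit only when no other in-flight entry writes this register.
        if reg not in self.dest.values():
            self.active &= ~(1 << reg)

    def lookup(self, src_reg):
        """Return the most recently allocated ROB index writing src_reg, or None."""
        if not (self.active >> src_reg) & 1:
            return None        # bit clear: the CAM search is skipped entirely
        self.cam_searches += 1
        matches = [i for i, r in self.dest.items() if r == src_reg]
        return matches[-1]     # dicts preserve insertion order


table = RobDestLookup()
table.allocate(0, 5)
table.allocate(1, 5)
miss = table.lookup(3)  # register 3 is inactive, no search performed
hit = table.lookup(5)   # register 5 is active, search performed
```
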
Whilst nothing's firmly set in stone here, as we have a Charter that
requires unanimous decision-making from contributors, so far it's leaning
towards Reorder Buffers and Tomasulo as a good, clean fit. In part that
is down to more research having been done on that particular algorithm.
+For completeness, scoreboarding and explicit register renaming need
+to be properly and comprehensively investigated.
More as it happens...