X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=simple_v_extension.mdwn;h=723cc07259960db0b00ed61b10873a310b9e3ab8;hb=a5eb28aa04e55ddd67ebe0841a290e2fb1400ef3;hp=0c0dde765ec4e8369be465ef8aece0fa08be86b6;hpb=5117d0ca405d7ee5a1dd4f3c61ea9795ef9a15ef;p=libreriscv.git
diff --git a/simple_v_extension.mdwn b/simple_v_extension.mdwn
index 0c0dde765..723cc0725 100644
--- a/simple_v_extension.mdwn
+++ b/simple_v_extension.mdwn
@@ -1,14 +1,5 @@
# Variable-width Variable-packed SIMD / Simple-V / Parallelism Extension Proposal
-* TODO 23may2018: CSR-CAM-ify regfile tables
-* TODO 23may2018: zero-mark predication CSR
-* TODO 28may2018: sort out VSETVL: CSR length to be removed?
-* TODO 09jun2018: Chennai Presentation more up-to-date
-* TODO 09jun2019: elwidth only 4 values (dflt, dflt/2, 8, 16)
-* TODO 09jun2019: extra register banks (future option)
-* TODO 09jun2019: new Reg CSR table (incl. packed=Y/N)
-
-
Key insight: Simple-V is intended as an abstraction layer to provide
a consistent "API" to parallelisation of existing *and future* operations.
*Actual* internal hardware-level parallelism is *not* required, such
@@ -1068,7 +1059,7 @@ Similar rules apply to the destination register.
* Throw an exception. Whether that actually results in spawning threads
as part of the trap-handling remains to be seen.
-# Under consideration
+# Under consideration
From the Chennai 2018 slides the following issues were raised.
Efforts to analyse and answer these questions are below.
@@ -1511,37 +1502,35 @@ the question is asked "How can each of the proposals effectively implement
### Example Instruction translation:
-Instructions "ADD r2 r4 r4" would result in three instructions being
-generated and placed into the FIFO:
+Instructions "ADD r7 r4 r4" would result in three instructions being
+generated and placed into the FIFO. r7 and r4 are marked as "vectorised":
+
+* ADD r7 r4 r4
+* ADD r8 r5 r5
+* ADD r9 r6 r6
-* ADD r2 r4 r4
-* ADD r2 r5 r5
-* ADD r2 r6 r6
+Instructions "ADD r7 r4 r1" would result in three instructions being
+generated and placed into the FIFO. r7 and r1 are marked as "vectorised"
+whilst r4 is not:
+
+* ADD r7 r4 r1
+* ADD r8 r4 r2
+* ADD r9 r4 r3
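
The two expansions above can be sketched in a few lines of Python. This is only an illustration, not the proposal's own implementation: the function name `expand`, the `vectorised` set and the `vl` parameter are hypothetical, and a vector length of 3 is assumed because the text says three instructions are generated.

```python
def expand(op, rd, rs1, rs2, vectorised, vl):
    """Expand one instruction into vl element-level instructions.

    Hypothetical helper: registers whose number is in `vectorised`
    advance by one per element; all other registers are treated as
    scalars and reused unchanged in every generated instruction.
    """
    def reg(base, i):
        # Vectorised registers step through consecutive register numbers.
        return base + i if base in vectorised else base

    return [
        f"{op} r{reg(rd, i)} r{reg(rs1, i)} r{reg(rs2, i)}"
        for i in range(vl)
    ]


# r7 and r4 vectorised (first example in the text)
vector_vector = expand("ADD", 7, 4, 4, vectorised={7, 4}, vl=3)

# r7 and r1 vectorised, r4 scalar (second example in the text)
vector_scalar = expand("ADD", 7, 4, 1, vectorised={7, 1}, vl=3)
```

Under these assumptions, `vector_vector` and `vector_scalar` hold the same instruction lists as the two bullet lists above.
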
## Example of vector / vector, vector / scalar, scalar / scalar => vector add
- register CSRvectorlen[XLEN][4]; # not quite decided yet about this one...
- register CSRpredicate[XLEN][4]; # 2^4 is max vector length
- register CSRreg_is_vectorised[XLEN]; # just for fun support scalars as well
- register x[32][XLEN];
-
- function op_add(rd, rs1, rs2, predr)
- {
-   /* note that this is ADD, not PADD */
-   int i, id, irs1, irs2;
-   # checks CSRvectorlen[rd] == CSRvectorlen[rs] etc. ignored
-   # also destination makes no sense as a scalar but what the hell...
- Â Â for (i = 0, id=0, irs1=0, irs2=0; i
@@ -2353,3 +2342,5 @@ TBD: floating-point compare and other exception handling
*
* Full Description (last page) of RVV instructions
+* PULP Low-energy Cluster Vector Processor
+