X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=simple_v_extension.mdwn;h=f4b7093598285247931ff58dd886e960e49f1f36;hb=c30746aab8a59d100d80b2ee104cf956b67dbbeb;hp=803bdba02313abd434003b79facf026cd636d4c2;hpb=91c21db2f5dcf2e43068c7aa45535228ed8c7dc0;p=libreriscv.git

diff --git a/simple_v_extension.mdwn b/simple_v_extension.mdwn
index 803bdba02..f4b709359 100644
--- a/simple_v_extension.mdwn
+++ b/simple_v_extension.mdwn
@@ -12,6 +12,10 @@ of Out-of-order restructuring (including parallel ALU lanes) or VLIW
 implementations, or SIMD, or anything else, would then benefit from
 the uniformity of a consistent API.
 
+**No arithmetic operations are added or required to be added.** SV is purely a parallelism API and consequentially is suitable for use even with RV32E.
+
+Talk slides:
+
 [[!toc ]]
 
 # Introduction
@@ -1059,7 +1063,7 @@ Similar rules apply to the destination register.
 * Throw an exception. Whether that actually results in spawning threads
   as part of the trap-handling remains to be seen.
 
-# Under consideration
+# Under consideration
 
 From the Chennai 2018 slides the following issues were raised.
 Efforts to analyse and answer these questions are below.
@@ -1502,37 +1506,35 @@ the question is asked "How can each of the proposals effectively implement
 
 ### Example Instruction translation:
 
-Instructions "ADD r2 r4 r4" would result in three instructions being
-generated and placed into the FIFO:
+Instructions "ADD r7 r4 r4" would result in three instructions being
+generated and placed into the FIFO. r7 and r4 are marked as "vectorised":
+
+* ADD r7 r4 r4
+* ADD r8 r5 r5
+* ADD r9 r6 r6
 
-* ADD r2 r4 r4
-* ADD r2 r5 r5
-* ADD r2 r6 r6
+Instructions "ADD r7 r4 r1" would result in three instructions being
+generated and placed into the FIFO. r7 and r1 are marked as "vectorised"
+whilst r4 is not:
+
+* ADD r7 r4 r1
+* ADD r8 r4 r2
+* ADD r9 r4 r3
 
 ## Example of vector / vector, vector / scalar, scalar / scalar => vector add
 
-    register CSRvectorlen[XLEN][4]; # not quite decided yet about this one...
-    register CSRpredicate[XLEN][4]; # 2^4 is max vector length
-    register CSRreg_is_vectorised[XLEN]; # just for fun support scalars as well
-    register x[32][XLEN];
-
-    function op_add(rd, rs1, rs2, predr)
-    {
-       /* note that this is ADD, not PADD */
-       int i, id, irs1, irs2;
-       # checks CSRvectorlen[rd] == CSRvectorlen[rs] etc. ignored
-       # also destination makes no sense as a scalar but what the hell...
-       for (i = 0, id=0, irs1=0, irs2=0; i
@@ -2344,3 +2346,5 @@ TBD: floating-point compare and other exception handling
 *
 * Full Description (last page) of RVV instructions
+* PULP Low-energy Cluster Vector Processor
+
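
Note on the "Example Instruction translation" hunk: the expansion rule it describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the specification's algorithm: the function name `expand`, the `vectorised` set of register numbers, and the fixed vector length of 3 are all hypothetical, chosen to reproduce the two examples in the added text. A register marked as vectorised advances by one per element; a scalar register is repeated unchanged.

```python
def expand(op, rd, rs1, rs2, vectorised, vl=3):
    """Expand one instruction into `vl` scalar instructions.

    `vectorised` is a (hypothetical) set of register numbers marked as
    vectors; `vl` is an assumed vector length. Both are illustrative only.
    """
    def reg(base, i):
        # Vectorised registers step through consecutive register numbers;
        # scalar registers stay the same for every element.
        return f"r{base + i}" if base in vectorised else f"r{base}"

    return [f"{op} {reg(rd, i)} {reg(rs1, i)} {reg(rs2, i)}" for i in range(vl)]


# First example from the hunk: r7 and r4 are vectorised.
fifo_a = expand("ADD", 7, 4, 4, vectorised={7, 4})

# Second example from the hunk: r7 and r1 are vectorised, r4 is scalar.
fifo_b = expand("ADD", 7, 4, 1, vectorised={7, 1})

for insn in fifo_a + fifo_b:
    print(insn)
```

Under these assumptions the first call yields `ADD r7 r4 r4`, `ADD r8 r5 r5`, `ADD r9 r6 r6`, and the second yields `ADD r7 r4 r1`, `ADD r8 r4 r2`, `ADD r9 r4 r3`, matching the lists added by the commit.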