From: lkcl
Date: Sat, 8 Apr 2023 12:41:24 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls012_v1~69
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=e75e70fd980fff14260dd28325dc063b1d10e303;p=libreriscv.git

---

diff --git a/openpower/sv/rfc/ls012.mdwn b/openpower/sv/rfc/ls012.mdwn
index 2702f6297..6d620a66e 100644
--- a/openpower/sv/rfc/ls012.mdwn
+++ b/openpower/sv/rfc/ls012.mdwn
@@ -204,6 +204,32 @@ a phenomental 35 instructions with *six branches* to emulate in Power ISA!
 For desktop as well as Server HTML/JS back-end execution of javascript this becomes an obvious
 priority, recognised already by ARM as just one example.
 
+## Bitmanip LUT2/3
+
+These LUT2/3 operations are high cost high reward. Outlined in [[sv/bitmanip]],
+the simplest ones already exist in PackedSIMD VSX: `xxeval`.
+The same reasoning applies as to fclass: SFFS needs to be stand-alone on its
+own merits and not "punished" should an implementor choose not to implement
+any aspect of PackedSIMD VSX.
+
+With Predication being such a high priority in GPUs and HPC, CR Field variants
+of Ternary and Binary LUT instructions were considered high priority, and again
+just like in the CRweird group the opportunity was taken to work on *all*
+bits of a CR Field rather than just one bit as is done with the existing CR operations
+crand, cror etc.
+
+The other high strategic value instruction is `grevlut` (and `grevluti` which can
+generate a remarkably large number of regular-patterned magic constants).
+The grevlut set require of the order of 20,000 gates but provide an astonishing
+plethora of innovative bit-permuting instructions never seen in any other ISA.
+
+The downside of all of these instructions is the extremely low XO bit requirements:
+2-3 bit XO due to the large immediates *and* the number of operands required.
+The LUT3 instructions are already compacted down to "Overwrite" variants.
+Realistically these high-value instructions should be proposed in EXT2xx where
+their XO cost does not overwhelm EXT0xx.
+
+
 ## (f)mv.swizzle
 
 [[sv/mv.swizzle]] is dicey. It is a 2-in 2-out operation whose value as a Scalar