From bc8cb5f14c11d3bb2030675c2877bdf75aeba00b Mon Sep 17 00:00:00 2001
From: lkcl
Date: Mon, 4 Apr 2022 13:17:52 +0100
Subject: [PATCH]

---
 openpower/sv/cr_int_predication.mdwn | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/cr_int_predication.mdwn b/openpower/sv/cr_int_predication.mdwn
index d3c9d9f6a..aca3df659 100644
--- a/openpower/sv/cr_int_predication.mdwn
+++ b/openpower/sv/cr_int_predication.mdwn
@@ -14,18 +14,22 @@ See:
 
 Rationale:
 
-Condition Registers are conceptually perfect for use as predicate masks, the only problem being that typical Vector ISAs have quite comprehensive mask-based instructions: set-before-first, popcount and much more. In fact many Vector ISAs can use Vectors *as* masks, consequently the entire Vector ISA is available for use in creating masks. This is not practical for SV given the strategy of leveraging pre-existing Scalar instructions in a minimalist way.
+Condition Registers are conceptually perfect for use as predicate masks, the only problem being that typical Vector ISAs have quite comprehensive mask-based instructions: set-before-first, popcount and much more. In fact many Vector ISAs can use Vectors *as* masks, consequently the entire Vector ISA is usually available for use in creating masks (one exception being AVX512 which
+has a dedicated Mask regfile and opcodes).
+Duplication of such operations (popcount etc) is not practical for SV given
+the strategy of leveraging pre-existing Scalar instructions in a minimalist way.
 
 With the scalar OpenPOWER v3.0B ISA having already popcnt, cntlz and others normally seen in Vector Mask operations it makes sense to allow *both* scalar integers *and* CR-Vectors to be predicate masks.
 That in turn means that much more comprehensive interaction between CRs and scalar Integers is required, because with the CR Predication Modes designating CR *Fields* (not CR bits) as Predicate Elements, fast transfers between CR *Fields* and the Integer Register File is needed. The opportunity is therefore taken to also augment CR logical arithmetic as well, using a mask-based paradigm that takes into consideration multiple bits of each CR Field (eq/lt/gt/ov). By contrast
-v3.0B Scalar CR instructions (crand, crxor) only allow a single bit calculation.
+v3.0B Scalar CR instructions (crand, crxor) only allow a single bit calculation, and both mtcr and mfcr are CR-orientated rather than CR *Field*
+orientated.
 
 Also strangely there is no v3.0 instruction for moving CR Fields, so that is corrected here with `mcrfm`. The opportunity is taken
-to allow inversion of CR Fields when copied.
+to allow inversion of CR Field bits, when copied.
 
 Basic concept:
 
-- 
2.30.2