From bc8cb5f14c11d3bb2030675c2877bdf75aeba00b Mon Sep 17 00:00:00 2001
From: lkcl <lkcl@web>
Date: Mon, 4 Apr 2022 13:17:52 +0100
Subject: [PATCH]

---
 openpower/sv/cr_int_predication.mdwn | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/cr_int_predication.mdwn b/openpower/sv/cr_int_predication.mdwn
index d3c9d9f6a..aca3df659 100644
--- a/openpower/sv/cr_int_predication.mdwn
+++ b/openpower/sv/cr_int_predication.mdwn
@@ -14,18 +14,22 @@ See:
 
 Rationale:
 
-Condition Registers are conceptually perfect for use as predicate masks, the only problem being that typical Vector ISAs have quite comprehensive mask-based instructions: set-before-first, popcount and much more.  In fact many Vector ISAs can use Vectors *as* masks, consequently the entire Vector ISA is available for use in creating masks.  This is not practical for SV given the strategy of leveraging pre-existing Scalar instructions in a minimalist way.
+Condition Registers are conceptually perfect for use as predicate masks, the only problem being that typical Vector ISAs have quite comprehensive mask-based instructions: set-before-first, popcount and much more.  In fact many Vector ISAs can use Vectors *as* masks; consequently the entire Vector ISA is usually available for use in creating masks (one exception being AVX512, which
+has a dedicated Mask regfile and opcodes).
+Duplication of such operations (popcount etc.) is not practical for SV given
+the strategy of leveraging pre-existing Scalar instructions in a minimalist way.
 
 With the scalar OpenPOWER v3.0B ISA having already popcnt, cntlz and others normally seen in Vector Mask operations it makes sense to allow *both* scalar integers *and* CR-Vectors to be predicate masks.  That in turn means that much more comprehensive interaction between CRs and scalar Integers is required, because with the CR Predication Modes designating CR *Fields*
 (not CR bits) as Predicate Elements, fast transfers between CR *Fields* and
 the Integer Register File is needed. 
 
 The opportunity is therefore taken to also augment CR logical arithmetic as well, using a mask-based paradigm that takes into consideration multiple bits of each CR Field (eq/lt/gt/ov).  By contrast
-v3.0B Scalar CR instructions (crand, crxor) only allow a single bit calculation.
+v3.0B Scalar CR instructions (crand, crxor) only allow a single-bit calculation, and both mtcr and mfcr are CR-orientated rather than CR *Field*
+orientated.
 
 Also strangely there is no v3.0 instruction for moving CR Fields,
 so that is corrected here with `mcrfm`. The opportunity is taken
-to allow inversion of CR Fields when copied.
+to allow inversion of CR Field bits when copied.
 
 Basic concept:
 
-- 
2.30.2