-It should be clear that a 4x4 by 4x4 Matrix Multiply, being effectively
-the same technique applied to four independent vectors, can be done by
-setting VL=64, using an extra dimension on the SHAPE0 and SHAPE1 SPRs,
-and applying a rotating 1D SHAPE SPR of xdim=16 to f8 in order to get
-it to apply four times to compute the four columns worth of vectors.
+The only other instruction required is one that ensures f4-f7 are
+initialised (usually to zero). However, if this is used as part
+of some other computation, which is frequently the case, then
+the zeroing is not needed.