Unlike the parallel case, A is not itself partitioned, so is copied
over as much as is possible. In some cases such as `1x 4-bit, 1x 12-bit`
-the 8-bit scalar source will need sign or zero extending.
+(partition mask = `0b100`, above), the 8-bit scalar source will need sign- or zero-extending.
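
The copying and extension described above can be sketched in plain Python. This is only an illustrative model, not the actual implementation: the function name `splat_scalar`, the 16-bit width made of four 4-bit lanes, and the convention that bit `i` of the partition mask marks a break between lane `i` and lane `i+1` are all assumptions made for the example.

```python
def splat_scalar(a, a_width, mask, lanes=4, lane_bits=4, signed=False):
    """Copy scalar `a` into every partition, truncating or extending to fit.

    Hypothetical model: bit i of `mask` set means a partition boundary
    between lane i and lane i+1 (an assumed convention).
    Returns (packed_result, [(partition_width, partition_value), ...]),
    with partitions listed from the least significant lane upwards.
    """
    a &= (1 << a_width) - 1                  # clamp the source to its width
    if signed and (a >> (a_width - 1)) & 1:
        a -= 1 << a_width                    # reinterpret as a negative int
    out, parts, start = 0, [], 0
    for lane in range(lanes):
        is_last = lane == lanes - 1
        if is_last or (mask >> lane) & 1:    # this lane closes a partition
            width = (lane + 1 - start) * lane_bits
            # Masking truncates when the partition is narrower than the
            # source, and sign- or zero-extends when it is wider.
            val = a & ((1 << width) - 1)
            out |= val << (start * lane_bits)
            parts.append((width, val))
            start = lane + 1
    return out, parts


# 8-bit scalar 0x85 with partition mask 0b100: one 12-bit and one 4-bit element
zext, zparts = splat_scalar(0x85, 8, 0b100, signed=False)
sext, sparts = splat_scalar(0x85, 8, 0b100, signed=True)
```

Under these assumptions the 4-bit partition receives only the low 4 bits of the scalar, while the 12-bit partition receives all 8 bits plus 4 bits of sign or zero extension.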