From 113edd0aa326382a4b93d8dfc770afa4c2fce6cb Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date: Thu, 20 Dec 2018 08:01:54 +0000
Subject: [PATCH] clarify

---
 updates/005_2018dec14_simd_without_simd.mdwn | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/updates/005_2018dec14_simd_without_simd.mdwn b/updates/005_2018dec14_simd_without_simd.mdwn
index c6e456c..6548729 100644
--- a/updates/005_2018dec14_simd_without_simd.mdwn
+++ b/updates/005_2018dec14_simd_without_simd.mdwn
@@ -143,8 +143,18 @@ anyway, for 3D, so if 64-bit operations happen to have half the number of
 Reservation Stations / Function Units, and block more often, we actually
 don't mind so much.  Also, we can still apply the same "banks" trick on
 the Register File, except this time with 4-way multiplexing on 32-bit
-wide banks, and 4x4 crossbars on the bytes:
+wide banks, and 4x4 crossbars on the bytes as well:
 
 {{register_file_multiplexing.jpg}}
 
+To cope with 16-bit operations, pairs of 8-bit values in adjacent Function
+Units are reserved.  Likewise for 64-bit operations, the 8-bit crossbars
+are not used, and pairs of 32-bit source values in adjacent Function Units
+in the *32-bit* FU area are reserved.
 
+However, the gate count in such a staggered crossbar arrangement is insane:
+bear in mind that this will be 3R1W or 2R1W (3 or 2 reads, 1 write per
+register), and that means **three** sets of crossbars, comprising **four**
+banks, with effectively 16-byte to 16-byte routing.
+
+It's too much, so this will be explored further in later updates.
-- 
2.30.2