From 10e56a569927d8be99e12965508d6f864f1086e4 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Wed, 19 Dec 2018 18:13:20 +0000
Subject: [PATCH] minor correction

---
 updates/003_2018dec04_microarchitecture.mdwn | 14 +++++++-------
 updates/005_2018dec14_simd_without_simd.mdwn |  8 +++++++-
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/updates/003_2018dec04_microarchitecture.mdwn b/updates/003_2018dec04_microarchitecture.mdwn
index b1735c5..84bb5b0 100644
--- a/updates/003_2018dec04_microarchitecture.mdwn
+++ b/updates/003_2018dec04_microarchitecture.mdwn
@@ -83,13 +83,13 @@ and, crucially, some ALUs may take longer than others, and the algorithm
 simply does not care. In addition, there may be a really simple way to
 extend the Reorder Buffer tags to accomodate SIMD-style characteristics.
 
-We also may need to have simple Branch Prediction, because some of the
-loops in [Kazan](https://salsa.debian.org/Kazan-team/kazan/) are particularly
-tight. A Reorder Buffer can easily be used to implement Branch Prediction,
-because, just as with an Exception, the ROB can to be cleared out
-(flushed) if the branch is mispredicted. As it is necessary to respect
-Exceptions, the logic has to exist to clear out the ROB: Branch Prediction
-simply uses this pre-existing logic.
+We also may need to have simple Branch Prediction, because some of
+the loops in [Kazan](https://salsa.debian.org/Kazan-team/kazan/)
+are particularly tight. A Reorder Buffer (ROB) can easily be used
+to implement Branch Prediction, because, just as with an Exception,
+the ROB can to be cleared out (flushed) if the branch is mispredicted.
+As it is necessary to respect Exceptions, the logic has to exist to
+clear out the ROB: Branch Prediction simply uses this pre-existing logic.
 
 The other way in which out-of-order execution can be handled is called
 scoreboarding, as well as explicit register renaming. These schemes
diff --git a/updates/005_2018dec14_simd_without_simd.mdwn b/updates/005_2018dec14_simd_without_simd.mdwn
index fb75a04..c6e456c 100644
--- a/updates/005_2018dec14_simd_without_simd.mdwn
+++ b/updates/005_2018dec14_simd_without_simd.mdwn
@@ -141,4 +141,10 @@ from *both* FUs.
 The primary focus is on 32-bit (single-precision floating-point) performance
 anyway, for 3D, so if 64-bit operations happen to have half the number of
 Reservation Stations / Function Units, and block more often, we actually
-don't mind so much.
+don't mind so much. Also, we can still apply the same "banks" trick on
+the Register File, except this time with 4-way multiplexing on 32-bit
+wide banks, and 4x4 crossbars on the bytes:
+
+{{register_file_multiplexing.jpg}}
+
+
-- 
2.30.2
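
Note on the first hunk: the claim that Branch Prediction "simply uses this pre-existing logic" can be illustrated with a small software model. The sketch below is not taken from the design; the class and method names (`ReorderBuffer`, `flush_from`, `raise_exception`, `resolve_branch`) are hypothetical, and a real ROB would be hardware rather than Python. It only shows that a single flush routine can serve both the Exception path and the mispredicted-branch path, while in-order commit keeps speculative results out of the register file until the branch resolves.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    tag: int                 # position in program order
    dest: Optional[str]      # destination register, None for a branch
    value: int = 0
    done: bool = False       # result has been produced by an ALU


class ReorderBuffer:
    """Toy ROB model (hypothetical names, illustration only)."""

    def __init__(self):
        self.entries = deque()
        self.next_tag = 0

    def issue(self, dest):
        """Allocate an entry at the tail, in program order."""
        entry = RobEntry(self.next_tag, dest)
        self.next_tag += 1
        self.entries.append(entry)
        return entry.tag

    def complete(self, tag, value=0):
        """Record a result; ALUs may finish in any order."""
        for e in self.entries:
            if e.tag == tag:
                e.value, e.done = value, True
                return

    def flush_from(self, tag):
        """Discard every entry at or after `tag` in program order."""
        while self.entries and self.entries[-1].tag >= tag:
            self.entries.pop()

    def raise_exception(self, tag):
        """Exception path: drop the faulting entry and all younger ones."""
        self.flush_from(tag)

    def resolve_branch(self, tag, mispredicted):
        """Branch path: reuse the same flush for the younger entries."""
        if mispredicted:
            self.flush_from(tag + 1)
        self.complete(tag)

    def commit(self, regfile):
        """Retire finished entries from the head, strictly in order."""
        while self.entries and self.entries[0].done:
            e = self.entries.popleft()
            if e.dest is not None:
                regfile[e.dest] = e.value


rob, regs = ReorderBuffer(), {}
t0 = rob.issue("r1")
br = rob.issue(None)        # the branch itself occupies an entry
t2 = rob.issue("r2")        # issued speculatively on the predicted path
rob.complete(t2, 99)        # finishes early, out of order
rob.complete(t0, 7)
rob.commit(regs)            # r1 retires; the unresolved branch blocks r2
rob.resolve_branch(br, mispredicted=True)
rob.commit(regs)            # the speculative r2 write was flushed
```

In this model the speculative result for `r2` never reaches `regs`, because the unresolved branch sits ahead of it in the buffer and commit is strictly in order.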