From 3f41923fc20c62146fac85652f155aacfe1a049a Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 15 Apr 2023 23:49:30 +0100
Subject: [PATCH]

---
 openpower/sv/remap.mdwn | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index 4d50fc14e..1d5d94027 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -124,9 +124,11 @@
 The following illustrative example multiplies a 3x4 and a 5x3 matrix to create
 a 5x4 result:
 
-    svshape 5, 4, 3, 0, 0
+```
+    svshape 5, 4, 3, 0, 0  # Outer Product
     svremap 15, 1, 2, 3, 0, 0, 0, 0
-    sv.fmadds *0, *8, *16, *0
+    sv.fmadds *0, *16, *32, *0
+```
 
 * svshape sets up the four SVSHAPE SPRS for a Matrix Schedule
 * svremap activates four out of five registers RA RB RC RT RS (15)
@@ -136,13 +138,14 @@ a 5x4 result:
   - RC to use SVSHAPE3
   - RT to use SVSHAPE0
   - RS Remapping to not be activated
-* sv.fmadds has RT=0.v, RA=8.v, RB=16.v, RC=0.v
+* sv.fmadds has Vectors at RT=0, RA=16, RB=32, RC=0
 * With REMAP being active each register's element index is
   *independently* transformed using the specified SHAPEs.
 
 Thus the Vector Loop is arranged such that the use of
 the multiply-and-accumulate instruction executes precisely the required
-Schedule to perform an in-place in-registers Matrix Multiply with no
+Schedule to perform an in-place in-registers Outer Product
+Matrix Multiply with no
 need to perform additional Transpose or register copy instructions.
 The example above may be executed as a unit test and demo,
 [here](https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svp64_matrix.py;h=c15479db9a36055166b6b023c7495f9ca3637333;hb=a17a252e474d5d5bf34026c25a19682e3f2015c3#l94)
-- 
2.30.2
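
The patched text describes a Schedule in which each register's element index is transformed independently, so that a single multiply-and-accumulate instruction performs the whole Matrix Multiply. A minimal Python sketch of that idea follows. It is not the SVSHAPE algorithm itself. The names `matrix_schedule` and `remapped_matmul` are hypothetical, as are the row-major layout, the operand shapes (5x3 times 3x4) and the loop order, which puts the reduction index outermost as one plausible reading of "Outer Product". The accumulator input is assumed to share RT's element index.

```python
def matrix_schedule(rows, cols, inner):
    """Yield one (rt, ra, rb) triple of flat element offsets per multiply-accumulate.

    Assumption: row-major storage, reduction index k outermost, so each pass
    over k adds one outer product (column k of A times row k of B) to the result.
    """
    for k in range(inner):
        for i in range(rows):
            for j in range(cols):
                yield i * cols + j, i * inner + k, k * cols + j


def remapped_matmul(a_flat, b_flat, rows, cols, inner):
    """Accumulate in place into a flat result vector, one element pair at a time."""
    acc = [0.0] * (rows * cols)
    for rt, ra, rb in matrix_schedule(rows, cols, inner):
        # Stand-in for one element of the vectorised multiply-and-accumulate.
        acc[rt] += a_flat[ra] * b_flat[rb]
    return acc


# Illustrative data only: a 5x3 matrix times a 3x4 matrix gives a 5x4 result.
ROWS, COLS, INNER = 5, 4, 3
a_flat = [float(n + 1) for n in range(ROWS * INNER)]
b_flat = [float(n % 7) for n in range(INNER * COLS)]
result = remapped_matmul(a_flat, b_flat, ROWS, COLS, INNER)
```

The sketch issues 60 multiply-accumulates (5 x 4 x 3) and needs no transpose or copy step, because the three offsets are computed separately for every element operation.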