# RFC ls004 Shift-And-Add

**URLs**:

* <https://libre-soc.org/openpower/sv/biginteger/analysis/>
* <https://libre-soc.org/openpower/sv/rfc/ls004/>
* bigint: <https://bugs.libre-soc.org/show_bug.cgi?id=960> TODO: maybe remove this link due to confusion and irrelevance?
* <https://git.openpower.foundation/isa/PowerISA/issues/91>
* shift-and-add <https://bugs.libre-soc.org/show_bug.cgi?id=968>
* add shaddw: <https://bugs.libre-soc.org/show_bug.cgi?id=996>

**Severity**: Major

**Status**: New

**Date**: 31 Oct 2022

**Target**: v3.2B

**Source**: v3.0B

**Books and Section affected**:

```
Book I Fixed-Point Shift Instructions 3.3.14.2
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
Instructions added
shadd - Shift and Add
shadduw - Shift and Add Unsigned Word
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
Addition of two new GPR-based instructions
```

**Impact on software**:

```
Requires support for new instructions in assembler, debuggers,
and related tools.
```

**Keywords**:

```
GPR, Bit-manip, Shift, Arithmetic
```

**Motivation**

Power ISA lacks load/store instructions with a shifted index, which are present in both ARM and x86.
Adding further LD/ST variants would be too complex, so shift-and-add is proposed as a compromise:
it replaces a pair of explicit instructions (a shift followed by an add) in hot loops.

**Notes and Observations**:

1. `shadd` and `shadduw` operate on unsigned integers.
2. `shadduw` is intended for computing address offsets,
   as the second operand is constrained to the lower 32 bits
   and zero-extended.
3. Both are 2-in 1-out instructions.

TODO: a signed 32-bit shift-and-add should be added. This needs to be addressed before submitting the RFC: <https://bugs.libre-soc.org/show_bug.cgi?id=996>

**Changes**

Add the following entries to:

* the Appendices of Book I
* Instructions of Book I added to Section 3.3.14.2

----------------

\newpage{}

# Shift-and-Add

`shadd RT, RA, RB`

| 0-5   | 6-10 | 11-15 | 16-20 | 21-22 | 23-30 | 31 | Form     |
|-------|------|-------|-------|-------|-------|----|----------|
| PO    | RT   | RA    | RB    | sm    | XO    | Rc | Z23-Form |

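As an illustration of the field layout in the table above, the following Python sketch packs and unpacks a Z23-Form word. It assumes Power ISA MSB0 numbering (bit 0 is the most significant bit of the 32-bit word). The helper names `encode_z23` and `decode_z23` are hypothetical, and the `PO` and `XO` values used are placeholders only, since this RFC does not list opcode assignments.

```python
# Field layout taken from the table above: (name, first bit, last bit),
# in MSB0 numbering, where bit 0 is the most significant bit.
Z23_FIELDS = [
    ("PO", 0, 5), ("RT", 6, 10), ("RA", 11, 15), ("RB", 16, 20),
    ("sm", 21, 22), ("XO", 23, 30), ("Rc", 31, 31),
]


def encode_z23(**values):
    """Pack named field values into a 32-bit word; missing fields are 0."""
    word = 0
    for name, first, last in Z23_FIELDS:
        width = last - first + 1
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        # In MSB0 numbering, a field ending at bit `last` sits 31-last bits up.
        word |= value << (31 - last)
    return word


def decode_z23(word):
    """Split a 32-bit word back into its Z23-Form fields."""
    return {
        name: (word >> (31 - last)) & ((1 << (last - first + 1)) - 1)
        for name, first, last in Z23_FIELDS
    }


# PO and XO are placeholders, NOT real opcode assignments.
word = encode_z23(PO=0b111111, RT=4, RA=1, RB=2, sm=2, XO=0, Rc=0)
fields = decode_z23(word)
```

Driving both directions from one table keeps the encoder and decoder consistent if the field positions change during review.
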
Pseudocode:

    shift <- sm + 1                      # Shift is in the range 1 to 4
    sum[0:63] <- ((RB) << shift) + (RA)  # Shift RB, add RA
    RT <- sum                            # Result stored in RT

When `sm` is zero, the contents of register RB are multiplied by 2,
added to the contents of register RA, and the result stored in RT.

`sm` is a 2-bit bitfield, and allows multiplication of RB by 2, 4, 8, or 16.

Operands RA and RB, and the result RT, are all 64-bit unsigned integers.

**NEED EXAMPLES (not sure how to embed sm)!!!**

Examples:

```
# adds r1 to (r2*8): shift = 3, i.e. sm = 2
# (assumes the last operand is the shift amount; how sm is written is still undecided)
shadd r4, r1, r2, 3
```

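The `shadd` pseudocode can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the function name `shadd` and its argument order are chosen for illustration, registers are represented as Python integers, and overflow is assumed to wrap modulo 2^64 because `sum[0:63]` keeps only 64 bits.

```python
MASK64 = (1 << 64) - 1


def shadd(ra, rb, sm):
    """Model of the shadd pseudocode on 64-bit unsigned register values."""
    if not 0 <= sm <= 3:
        raise ValueError("sm is a 2-bit field (0..3)")
    shift = sm + 1                          # shift is in the range 1 to 4
    return ((rb << shift) + ra) & MASK64    # keep the low 64 bits only


base, index = 0x1000, 5
result = shadd(base, index, sm=2)           # base + index * 8
print(hex(result))                          # 0x1028
```

With `sm=2` the shift is 3, so the second operand is scaled by 8, which matches indexing an array of 64-bit elements.
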
# Shift-and-Add Unsigned Word

`shadduw RT, RA, RB`

| 0-5   | 6-10 | 11-15 | 16-20 | 21-22 | 23-30 | 31 | Form     |
|-------|------|-------|-------|-------|-------|----|----------|
| PO    | RT   | RA    | RB    | sm    | XO    | Rc | Z23-Form |

Pseudocode:

    shift <- sm + 1                      # Shift is in the range 1 to 4
    n <- (RB)[32:63]                     # Only use lower 32 bits of RB
    sum[0:63] <- (n << shift) + (RA)     # Shift n, add RA
    RT <- sum                            # Result stored in RT

When `sm` is zero, the lower word of register RB is multiplied by 2,
added to the contents of register RA, and the result stored in RT.

`sm` is a 2-bit bitfield, and allows multiplication of the lower word of RB by 2, 4, 8, or 16.

Operand RA and the result RT are 64-bit unsigned integers. Only the lower 32 bits of RB are used, zero-extended to 64 bits.

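A minimal Python model of the `shadduw` pseudocode, for comparison with `shadd`. As before, the function name and argument order are illustrative assumptions, and wrap-around modulo 2^64 is assumed from `sum[0:63]`. The mask on `rb` stands in for `(RB)[32:63]`, that is, the low word treated as zero-extended.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def shadduw(ra, rb, sm):
    """Model of the shadduw pseudocode: only the low word of RB is used."""
    if not 0 <= sm <= 3:
        raise ValueError("sm is a 2-bit field (0..3)")
    shift = sm + 1
    n = rb & MASK32                         # (RB)[32:63], zero-extended
    return ((n << shift) + ra) & MASK64


# The upper word of RB is ignored entirely.
ignored_upper = shadduw(0x1000, 0xFFFF_FFFF_0000_0003, sm=0)

# A low word of all ones is treated as a large unsigned value, not as -1.
unsigned_low = shadduw(0, 0xFFFF_FFFF, sm=0)
```

The second call shows the effect of zero-extension: the result is `0x1FFFFFFFE`, whereas a sign-extending variant would treat the same input as -1.
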
*Programmer's Note*:
This instruction is useful for computing address offsets. RA holds the 64-bit base
address, and RB holds the offset into the data structure, limited to 32 bits.

Examples:

```
# r4 = r1 + ((low 32 bits of r2, zero-extended) << (sm + 1))
shadduw r4, r1, r2
```

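The address-offset use case from the Programmer's Note can be sketched as follows. Everything in the setup is hypothetical: a flat `bytearray` stands in for memory, `BASE` is an arbitrary offset, and the elements are 8-byte little-endian values. The `shadduw` helper is the same illustrative model of the pseudocode, repeated so that the block runs on its own.

```python
import struct

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def shadduw(ra, rb, sm):
    """Illustrative model of shadduw: RA + (low word of RB << (sm + 1))."""
    return (((rb & MASK32) << (sm + 1)) + ra) & MASK64


# Hypothetical memory image holding a table of 8-byte elements at BASE.
memory = bytearray(256)
BASE = 64
for i, value in enumerate([10, 20, 30, 40]):
    struct.pack_into("<Q", memory, BASE + 8 * i, value)

# The index register has stale upper bits; only the low word (2) matters.
index_reg = 0xDEAD_BEEF_0000_0002
addr = shadduw(BASE, index_reg, sm=2)       # BASE + 2 * 8
(element,) = struct.unpack_from("<Q", memory, addr)
```

Because the upper word of the index register is discarded, a 32-bit index does not need a separate zero-extension step before the address calculation.
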
[[!tag opf_rfc]]

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description |
|------|------|------|---------|----------|-------------|
| Z23  | I    | #    | 3.0B    | shadd    | Shift-and-Add |
| Z23  | I    | #    | 3.0B    | shadduw  | Shift-and-Add Unsigned Word |