# RFC ls011 LD/ST-Update-PostIncrement

* Funded by NLnet under the Privacy and Enhanced Trust Programme, EU
  Horizon2020 Grant 825310, and NGI0 Entrust No 101069594
* <https://bugs.libre-soc.org/show_bug.cgi?id=1048>
* <https://libre-soc.org/openpower/sv/rfc/ls011/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1045>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>

**Severity**: Major

**Status**: New

**Date**: 21 Apr 2023.

**Target**: v3.2B

**Source**: v3.0B

**Books and Section affected**:

```
Chapter 3 Book I, new Fixed-Point Load / Store Sections 3.3.2 3.3.3
Chapter 4 Book I, new Floating-Point Load / Store Sections 4.6.2 4.6.3
```

**Summary**

```
TODO
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
Addition of new Load/Store Fixed and Floating Point instructions
```

**Impact on software**:

```
Requires support for new instructions in assembler, debuggers, and related tools.
Reduces instructions in hot-loops
```

**Keywords**:

```

```

**Motivation**

Moving the update of RA to *after* the Memory operation saves on instruction count
both outside and inside hot-loops. strncpy may be reduced to 11 Vector instructions,
3 of which are the zeroing loop and 5 of which are the copy. Percentage-wise, LD/ST
Update Post-Increment represents a massive 20% reduction.

**Notes and Observations**:

These types of instructions are already present in x86 (sort-of).

* x86 chose that store should be pre-indexed and load should be post-indexed
* Power ISA chose everything to be pre-indexed
* Motorola 68000 (decades old) has both pre- and post-indexed addressing

<https://tack.sourceforge.net/olddocs/m68020.html#2.2.2.%20Extra%20MC68020%20addressing%20modes>

<https://azeria-labs.com/memory-instructions-load-and-store-part-4/>

**Changes**

Add the following entries to:

* New Load/Store Sections
* Appendices

[[!tag opf_rfc]]

--------

\newpage{}

TODO (key stub notes below)

The LD/ST-Immediate-Post-Increment instructions are all Primary
Opcode instructions: there are 13 of these. The LD/ST-Indexed-Post-Increment
instructions are all effectively 9-bit XO and consequently may easily
fit into one single Primary Opcode. EXT2xx is recommended.

One alternative idea is that bit 31 could be allocated (retrospectively)
to Post-Increment. Although it may be too late for Scalar Power ISA,
it **may** be possible to consider this for SVP64Single and/or SVP64-Vector,
but doing so risks creating a non-orthogonal ISA.

```
# LD/ST-Postincrement
lbzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
lbzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
lhzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
lhzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
lhaup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
lhaupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
lwzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
lwzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
lwaupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
ldup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
ldupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
stbup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
stbupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
sthup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
sthupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
stwup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
stwupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
stdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
stdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W

# FP LD/ST-Postincrement
lfdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
lfsup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
lfdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
lfsupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
stfdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
stfsup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
stfdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
stfsupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W

# LD/ST-Shifted-Postincrement
lbzupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lhzupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lhaupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lwzupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lwaupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
ldupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
stbupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
sthupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
stwupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
stdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W

# FP LD/ST-Shifted-Postincrement
lfdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
lfsupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
stfdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
stfsupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
```

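
The regularity of the table can be checked with a short Python sketch. It
rests on two assumptions that are not stated in the RFC: that the last column
counts register reads and writes (so `1R2W` is one read, two writes), and that
a mnemonic ending in `x` is an Indexed (XO) form while every other mnemonic
is an Immediate (Primary Opcode) form. The helper name `classify` is
hypothetical.

```python
# Mnemonics from the table, one group per line.
MNEMONICS = """
lbzup lbzupx lhzup lhzupx lhaup lhaupx lwzup lwzupx lwaupx ldup ldupx
stbup stbupx sthup sthupx stwup stwupx stdup stdupx
lfdup lfsup lfdupx lfsupx stfdup stfsup stfdupx stfsupx
lbzupsx lhzupsx lhaupsx lwzupsx lwaupsx ldupsx
stbupsx sthupsx stwupsx stdupsx
lfdupsx lfsupsx stfdupsx stfsupsx
""".split()


def classify(mnemonic):
    """Derive the form and the assumed register-port count from the name."""
    is_store = mnemonic.startswith("st")
    is_indexed = mnemonic.endswith("x")
    # Reads: RA always, RB when indexed, RS when storing.
    reads = 1 + int(is_indexed) + int(is_store)
    # Writes: RA always (the post-update), RT/FRT when loading.
    writes = 1 + int(not is_store)
    return {
        "store": is_store,
        "indexed": is_indexed,
        "ports": f"{reads}R{writes}W",
    }


# Immediate forms are the ones that each need a Primary Opcode.
immediate = [m for m in MNEMONICS if not classify(m)["indexed"]]
print(len(immediate), "immediate forms out of", len(MNEMONICS))
```

Under these assumptions the count of Immediate forms agrees with the figure
of 13 quoted in the stub notes.
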
# Example

Here is an annotated example in which the pseudo-code changes to
use just `RA` as the address, otherwise remaining the same.
No actual change to the Effective Address computation itself
occurs in any of the Post-Update instructions: the same sum is
still what is written back to `RA`.

**Load Byte and Zero with Post-Update**

D-Form

* lbzup RT,D(RA)

Pseudo-code:

```
EA <- (RA)                             # EA just RA
RT <- ([0] * (XLEN-8)) || MEM(EA, 1)   # then load
RA <- (RA) + EXTS(D)                   # then update RA after
```

Special Registers Altered:

```
None
```

whereas the existing pseudo-code for `lbzu` is:

```
EA <- (RA) + EXTS(D)                   # EA includes D
RT <- ([0] * (XLEN-8)) || MEM(EA, 1)   # load from RA+D
RA <- EA                               # and update RA
```
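
The difference between the two pseudo-code listings can be modelled in a few
lines of Python. This is only an illustrative sketch: the register file and
memory are plain dictionaries, `XLEN` is assumed to be 64, and the helper
`exts16` is a hypothetical name for the sign-extension of the 16-bit `D`
field.

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def exts16(d):
    """Sign-extend a 16-bit immediate to a Python integer."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lbzu(regs, mem, rt, ra, d):
    """Existing pre-update form: access memory at RA+D, then RA <- EA."""
    ea = (regs[ra] + exts16(d)) & MASK
    regs[rt] = mem[ea] & 0xFF          # byte load, zero-extended
    regs[ra] = ea


def lbzup(regs, mem, rt, ra, d):
    """Proposed post-update form: access memory at RA, then RA <- RA+D."""
    ea = regs[ra]
    regs[rt] = mem[ea] & 0xFF          # byte load, zero-extended
    regs[ra] = (regs[ra] + exts16(d)) & MASK


mem = {0x1000: 0xAA, 0x1001: 0xBB}     # byte address -> byte value

pre_regs = {3: 0x1000, 4: 0}
lbzu(pre_regs, mem, rt=4, ra=3, d=1)

post_regs = {3: 0x1000, 4: 0}
lbzup(post_regs, mem, rt=4, ra=3, d=1)

# Both leave RA at 0x1001, but they load from different addresses.
print(hex(pre_regs[4]), hex(pre_regs[3]))
print(hex(post_regs[4]), hex(post_regs[3]))
```
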

-----

\newpage{}

# Fixed-Point Load with Post-Update

Add the following additional Section to Fixed-Point Load: Book I 3.3.2.1

TODO: move the inline import to pifixedload here... (separate commit).

-----

\newpage{}

# Fixed-Point Store with Post-Update

Add the following as a new section in Fixed-Point Store, Book I

## Store Byte with Post-Update

D-Form

```
|0 |6 |9 |10 |11 |16 |31 |
| PO | RS | RA| D |
```

* stbup RS,D(RA)

Pseudo-code:

```
EA <- (RA) + EXTS(D)
ea <- (RA)
MEM(ea, 1) <- (RS)[XLEN-8:XLEN-1]
RA <- EA
```

Special Registers Altered:

None

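
The `stbup` pseudo-code can be sketched in Python as follows. The sketch
assumes `XLEN` is 64 and models registers and memory as dictionaries;
`msb0_field` and `exts16` are hypothetical helper names. Power ISA numbers
bits from the most significant end, so `(RS)[XLEN-8:XLEN-1]` selects the
least significant byte of `RS`.

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def exts16(d):
    """Sign-extend a 16-bit immediate to a Python integer."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def msb0_field(value, start, end):
    """Extract bits start..end inclusive, where bit 0 is the MSB of XLEN."""
    width = end - start + 1
    shift = XLEN - 1 - end
    return (value >> shift) & ((1 << width) - 1)


def stbup(regs, mem, rs, ra, d):
    """Store the low byte of RS at RA, then advance RA by EXTS(D)."""
    EA = (regs[ra] + exts16(d)) & MASK
    ea = regs[ra]
    mem[ea] = msb0_field(regs[rs], XLEN - 8, XLEN - 1)
    regs[ra] = EA


# Usage: write three bytes through a pointer that advances by 1 each time.
regs = {3: 0x2000, 5: 0}
mem = {}
for ch in b"abc":
    regs[5] = 0x1100 | ch              # bits above the low byte are ignored
    stbup(regs, mem, rs=5, ra=3, d=1)

print({hex(a): hex(v) for a, v in mem.items()}, hex(regs[3]))
```

Because the store uses `RA` before the update, the first byte lands at the
initial pointer value and no separate pointer adjustment is needed before the
loop.
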
## Store Byte with Post-Update Indexed

X-Form

```
|0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
| PO | RS | RA | RB | XO | / |
```

* stbupx RS,RA,RB

Pseudo-code:

```
EA <- (RA) + (RB)
ea <- (RA)
MEM(ea, 1) <- (RS)[XLEN-8:XLEN-1]
RA <- EA
```

Special Registers Altered:

None

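
In the Indexed form the stride comes from `RB` rather than an immediate. A
minimal Python sketch, again assuming `XLEN` is 64 and dictionary-based
registers and memory, shows a forward stride of 4 and a backward stride of 1
(held in `RB` as a two's complement value):

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def stbupx(regs, mem, rs, ra, rb):
    """Store the low byte of RS at RA, then advance RA by RB."""
    EA = (regs[ra] + regs[rb]) & MASK  # wraps modulo 2**XLEN
    ea = regs[ra]
    mem[ea] = regs[rs] & 0xFF          # (RS)[XLEN-8:XLEN-1]
    regs[ra] = EA


# Forward: write every fourth byte, starting at the initial pointer.
regs = {3: 0x3000, 4: 4, 5: 0x7F}
mem = {}
for _ in range(3):
    stbupx(regs, mem, rs=5, ra=3, rb=4)
forward_addresses = sorted(mem)
forward_pointer = regs[3]

# Backward: a stride of -1 walks down through memory.
regs = {3: 0x3002, 4: (-1) & MASK, 5: 0x11}
mem = {}
for _ in range(3):
    stbupx(regs, mem, rs=5, ra=3, rb=4)
backward_addresses = sorted(mem)
backward_pointer = regs[3]

print([hex(a) for a in forward_addresses], hex(forward_pointer))
print([hex(a) for a in backward_addresses], hex(backward_pointer))
```
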
## Store Halfword with Post-Update

D-Form

```
|0 |6 |9 |10 |11 |16 |31 |
| PO | RS | RA| D |
```

* sthup RS,D(RA)

Pseudo-code:

```
EA <- (RA) + EXTS(D)
ea <- (RA)
MEM(ea, 2) <- (RS)[XLEN-16:XLEN-1]
RA <- EA
```

Special Registers Altered:

None

## Store Halfword with Post-Update Indexed

X-Form

```
|0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
| PO | RS | RA | RB | XO | / |
```

* sthupx RS,RA,RB

Pseudo-code:

```
EA <- (RA) + (RB)
ea <- (RA)
MEM(ea, 2) <- (RS)[XLEN-16:XLEN-1]
RA <- EA
```

Special Registers Altered:

None

## Store Word with Post-Update

D-Form

```
|0 |6 |9 |10 |11 |16 |31 |
| PO | RS | RA| D |
```

* stwup RS,D(RA)

Pseudo-code:

```
EA <- (RA) + EXTS(D)
ea <- (RA)
MEM(ea, 4) <- (RS)[XLEN-32:XLEN-1]
RA <- EA
```

Special Registers Altered:

None

## Store Word with Post-Update Indexed

X-Form

```
|0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
| PO | RS | RA | RB | XO | / |
```

* stwupx RS,RA,RB

Pseudo-code:

```
EA <- (RA) + (RB)
ea <- (RA)
MEM(ea, 4) <- (RS)[XLEN-32:XLEN-1]
RA <- EA
```

Special Registers Altered:

None

## Store Doubleword with Post-Update

DS-Form

* stdup RS,DS(RA)

Pseudo-code:

```
EA <- (RA) + EXTS(DS || 0b00)
ea <- (RA)
MEM(ea, 8) <- (RS)
RA <- EA
```

Special Registers Altered:

None

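
The DS-Form displacement differs from the D-Form one: the 14-bit `DS` field
has two zero bits appended before sign-extension, so the resulting byte
offset is always a multiple of 4. The Python sketch below illustrates this
under the same assumptions as before (`XLEN` of 64, dictionary-based
registers). To stay neutral on byte order, memory is modelled as a dictionary
holding one whole doubleword per address, which is a simplification. The
names `exts` and `ds_offset` are hypothetical.

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def exts(value, bits):
    """Sign-extend a `bits`-wide field to a Python integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def ds_offset(ds):
    """EXTS(DS || 0b00): append two zero bits to the 14-bit DS field."""
    return exts((ds & 0x3FFF) << 2, 16)


def stdup(regs, mem, rs, ra, ds):
    """Store the doubleword in RS at RA, then advance RA by the DS offset."""
    EA = (regs[ra] + ds_offset(ds)) & MASK
    ea = regs[ra]
    mem[ea] = regs[rs] & MASK          # simplified: one doubleword per key
    regs[ra] = EA


regs = {3: 0x4000, 5: 0x1122334455667788}
mem = {}
stdup(regs, mem, rs=5, ra=3, ds=2)         # offset +8
after_first = regs[3]
stdup(regs, mem, rs=5, ra=3, ds=0x3FFE)    # offset -8
after_second = regs[3]

print(ds_offset(2), ds_offset(0x3FFE), hex(after_first), hex(after_second))
```
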
## Store Doubleword with Post-Update Indexed

X-Form

```
|0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
| PO | RS | RA | RB | XO | / |
```

* stdupx RS,RA,RB

Pseudo-code:

```
EA <- (RA) + (RB)
ea <- (RA)
MEM(ea, 8) <- (RS)
RA <- EA
```

Special Registers Altered:

None

\newpage{}
[[!inline pages="openpower/isa/fixedload" raw=yes ]]
\newpage{}
[[!inline pages="openpower/isa/fixedstore" raw=yes ]]
\newpage{}
[[!inline pages="openpower/isa/fpload" raw=yes ]]
\newpage{}
[[!inline pages="openpower/isa/fpstore" raw=yes ]]
\newpage{}
[[!inline pages="openpower/isa/pifixedload" raw=yes ]]
\newpage{}
[[!inline pages="openpower/isa/pifixedstore" raw=yes ]]

[[!tag opf_rfc]]