# RFC ls011 LD/ST-Update-PostIncrement

* <https://bugs.libre-soc.org/show_bug.cgi?id=1048>
* <https://libre-soc.org/openpower/sv/rfc/ls011/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1045>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>

**Date**: 21 Apr 2023.

**Books and Sections affected**:

* Chapter 3 Book I, new Fixed-Point Load / Store Sections 3.3.2 and 3.3.3
* Chapter 4 Book I, new Floating-Point Load / Store Sections 4.6.2 and 4.6.3

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

**Impact on software**:

* Requires support for new instructions in assembler, debuggers, and related tools.
* Reduces the number of instructions in hot-loops.

**Notes and Observations**:

Add the following entries to:

* A new "Vector Looping" Book
* New Vector-Looping Chapters
* New Vector-Looping Appendices

TODO (key stub notes below)

The following instructions are proposed to be added in EXT2xx.
They duplicate the existing LD/ST-Update functionality but move the
update of RA to *after* the Memory operation. Instructions of this
kind are already present in other ISAs, including (in a limited form) x86:

* x86 chose that store should be pre-indexed and load should be post-indexed
* Power ISA chose everything to be pre-indexed
* Motorola 68000 (decades old) has both pre- and post-indexed

<https://tack.sourceforge.net/olddocs/m68020.html#2.2.2.%20Extra%20MC68020%20addressing%20modes>

<https://azeria-labs.com/memory-instructions-load-and-store-part-4/>

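The practical difference between the two styles shows up in loop setup. The sketch below is a toy Python model, not ISA pseudocode: the helper names (`load_pre_update`, `load_post_update`) and the `bytes` object standing in for memory are assumptions made purely for illustration. With a pre-indexed update the pointer has to be biased back by one stride before the loop starts; with a post-indexed update it can start at the first element.

```python
# Toy model: memory is a bytes object, a "register" is a plain int.
# Both helpers are hypothetical illustrations, not ISA definitions.

def load_pre_update(mem, ra, d):
    """Pre-indexed: the address is ra + d, and ra becomes that address."""
    ea = ra + d
    return mem[ea], ea

def load_post_update(mem, ra, d):
    """Post-indexed: the address is ra, and ra becomes ra + d afterwards."""
    return mem[ra], ra + d

def sum_pre(mem, base, count):
    ra = base - 1          # extra setup: bias the pointer by one stride
    total = 0
    for _ in range(count):
        value, ra = load_pre_update(mem, ra, 1)
        total += value
    return total

def sum_post(mem, base, count):
    ra = base              # no bias needed
    total = 0
    for _ in range(count):
        value, ra = load_post_update(mem, ra, 1)
        total += value
    return total

data = bytes([10, 20, 30, 40])
```

Both loops visit the same four bytes; only `sum_pre` needs the extra pointer adjustment before the loop.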
The LD/ST-Immediate-Post-Increment instructions each require a Primary
Opcode: there are 13 of these. The LD/ST-Indexed-Post-Increment
instructions are all effectively 9-bit XO and consequently may easily
fit into one single Primary Opcode. EXT2xx is recommended.

One alternative idea is that bit 31 could be allocated (retrospectively)
to Post-Increment. Although it may be too late for Scalar Power ISA,
it **may** be possible to consider this for SVP64Single and/or SVP64-Vector,
but doing so risks creating a non-Orthogonal ISA.

    # LD/ST-Postincrement
    lbzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    lbzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    lhzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    lhzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    lhaup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    lhaupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    lwzup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    lwzupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    lwaupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    ldup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    ldupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    stbup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
    stbupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
    sthup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
    sthupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
    stwup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
    stwupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
    stdup, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
    stdupx, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
    # FP LD/ST-Postincrement
    lfdu, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    lfsu, ls011, high, PO, yes, EXT2xx, no, isa/pifixedload, 1R2W
    lfdux, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    lfsux, ls011, high, 10, yes, EXT2xx, no, isa/pifixedload, 2R2W
    stfdu, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
    stfsu, ls011, high, PO, yes, EXT2xx, no, isa/pifixedstore, 2R1W
    stfdux, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
    stfsux, ls011, high, 10, yes, EXT2xx, no, isa/pifixedstore, 3R1W
    # LD/ST-Shifted-Postincrement
    lbzuspx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    lhzuspx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    lhauspx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    lwzuspx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    lwauspx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    lduspx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    stbuspx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
    sthuspx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
    stwuspx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
    stduspx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
    # FP LD/ST-Shifted-Postincrement
    lfdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    lfsupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 2R2W
    stfdupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W
    stfsupsx, ls011, med, 10, yes, EXT2xx, no, ls011, 3R1W

Here is an annotated example in which the pseudo-code changes to
use just `RA` as the address, otherwise remaining the same.
No actual change to the Effective Address computation itself
occurs in any of the Post-Update instructions.

**Load Byte and Zero with Post-Update**

    EA <- (RA)                             # EA just RA
    RT <- ([0] * (XLEN-8)) || MEM(EA, 1)   # then load
    RA <- (RA) + EXTS(D)                   # then update RA after

Special Registers Altered:

For comparison, the corresponding pseudocode for the existing `lbzu` is:

    EA <- (RA) + EXTS(D)                   # EA includes D
    RT <- ([0] * (XLEN-8)) || MEM(EA, 1)   # load from RA+D
    RA <- EA                               # and update RA
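
The two pseudocode fragments can be compared side by side with a small executable model. This is a minimal sketch under stated assumptions: the register file is a Python list of 32 integers, memory is a `bytes` object, and the helper `exts16` (a hypothetical name) sign-extends the 16-bit `D` field. None of these names come from the RFC itself.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def exts16(d):
    """Sign-extend a 16-bit immediate field to a Python int (assumed helper)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def lbzu(gpr, mem, rt, ra, d):
    """Existing pre-update form: the address includes D."""
    ea = (gpr[ra] + exts16(d)) & MASK   # EA includes D
    gpr[rt] = mem[ea]                   # byte is zero-extended by construction
    gpr[ra] = ea                        # RA updated to EA

def lbzup(gpr, mem, rt, ra, d):
    """Proposed post-update form: the address is just RA."""
    ea = gpr[ra]                              # EA is just RA
    gpr[rt] = mem[ea]                         # load first
    gpr[ra] = (gpr[ra] + exts16(d)) & MASK    # then update RA afterwards
```

Starting from the same `RA`, both forms leave `RA` holding the same updated value; they differ only in which byte is loaded.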
# Fixed-Point Load with Post-Update

Add the following additional Section to Fixed-Point Load: Book I 3.3.2.1

## Load Byte and Zero with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    RT <- ([0] * (XLEN-8)) || MEM(EA, 1)

Let the effective address (EA) be (RA).
The byte in storage addressed by EA is loaded into
RT[56:63]. RT[0:55] are set to 0.

The sum (RA)+D is placed into register RA.

If RA=0 or RA=RT, the instruction form is invalid.
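
The invalid-form rule and the MSB0 bit ranges in the description above can be checked with a short sketch. The function names (`is_invalid_load_update_form`, `msb0_field`) are hypothetical and XLEN=64 is assumed; the only facts relied on are those stated in the text, namely that bit 0 is the most significant bit and that RA=0 or RA=RT makes the form invalid.

```python
XLEN = 64  # assumed register width for this illustration

def is_invalid_load_update_form(rt, ra):
    """True when the register-number fields make the instruction form invalid."""
    return ra == 0 or ra == rt

def msb0_field(value, start, end):
    """Extract bits start..end (inclusive) in MSB0 numbering (bit 0 is the MSB)."""
    width = end - start + 1
    shift = XLEN - 1 - end
    return (value >> shift) & ((1 << width) - 1)
```

For a register holding a zero-extended byte, `msb0_field(rt, 56, 63)` returns the byte and `msb0_field(rt, 0, 55)` returns 0, matching the RT[56:63] and RT[0:55] wording.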
Special Registers Altered:

## Load Byte and Zero with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RT | RA | RB | XO | / |

    RT <- ([0] * (XLEN-8)) || MEM(EA, 1)

Let the effective address (EA) be (RA).
The byte in storage addressed by EA is loaded into
RT[56:63]. RT[0:55] are set to 0.

The sum (RA)+(RB) is placed into register RA.

If RA=0 or RA=RT, the instruction form is invalid.
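
The indexed description can be modelled in the same way. The sketch below is an illustration only: the function name and the `nbytes` parameter are assumptions, the register file is a Python list, and big-endian byte order is assumed for multi-byte loads simply to keep the example deterministic.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def load_zero_post_update_indexed(gpr, mem, rt, ra, rb, nbytes=1):
    """Model of a zero-extending load with post-update, indexed form.

    gpr is a list of register values, mem a bytes object. Byte order is
    assumed big-endian here; the text above does not depend on it.
    """
    if ra == 0 or ra == rt:
        raise ValueError("invalid instruction form")
    ea = gpr[ra]                                        # EA is (RA)
    gpr[rt] = int.from_bytes(mem[ea:ea + nbytes], "big")  # zero-extended
    gpr[ra] = (gpr[ra] + gpr[rb]) & MASK                # RA <- (RA) + (RB)
```

With `nbytes=1` this follows the byte description above; `nbytes=2` and `nbytes=4` correspond to the halfword and word variants.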
Special Registers Altered:

## Load Halfword and Zero with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    RT <- ([0] * (XLEN-16)) || MEM(EA, 2)

Let the effective address (EA) be (RA).
The halfword in storage addressed by EA is loaded into
RT[48:63]. RT[0:47] are set to 0.

The sum (RA)+D is placed into register RA.

If RA=0 or RA=RT, the instruction form is invalid.

Special Registers Altered:

## Load Halfword and Zero with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RT | RA | RB | XO | / |

    RT <- ([0] * (XLEN-16)) || MEM(EA, 2)

Let the effective address (EA) be (RA).
The halfword in storage addressed by EA is loaded into
RT[48:63]. RT[0:47] are set to 0.

The sum (RA)+(RB) is placed into register RA.

If RA=0 or RA=RT, the instruction form is invalid.

Special Registers Altered:

## Load Halfword Algebraic with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    RT <- EXTS(MEM(EA, 2))
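
The `EXTS` operation used by the Algebraic loads can be sketched as follows. This is a minimal illustration assuming XLEN=64 and an unsigned Python integer as the register image; the function name `exts` and its parameters are chosen here for the example.

```python
XLEN = 64  # assumed register width

def exts(value, bits, xlen=XLEN):
    """Sign-extend the low `bits` bits of value to an xlen-bit register image."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):      # sign bit of the loaded quantity is set
        value -= 1 << bits             # make it negative
    return value & ((1 << xlen) - 1)   # wrap to an unsigned xlen-bit image
```

A loaded halfword of `0x8000` therefore fills the upper 48 bits of RT with ones, while `0x7FFF` leaves them zero.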

Special Registers Altered:

## Load Halfword Algebraic with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RT | RA | RB | XO | / |

    RT <- EXTS(MEM(EA, 2))

Special Registers Altered:

## Load Word and Zero with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    RT <- [0] * 32 || MEM(EA, 4)

Let the effective address (EA) be (RA).
The word in storage addressed by EA is loaded into
RT[32:63]. RT[0:31] are set to 0.

The sum (RA)+D is placed into register RA.

If RA=0 or RA=RT, the instruction form is invalid.

Special Registers Altered:

## Load Word and Zero with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RT | RA | RB | XO | / |

    RT <- [0] * 32 || MEM(EA, 4)

Let the effective address (EA) be (RA).
The word in storage addressed by EA is loaded into
RT[32:63]. RT[0:31] are set to 0.

The sum (RA)+(RB) is placed into register RA.

If RA=0 or RA=RT, the instruction form is invalid.

Special Registers Altered:

## Load Word Algebraic with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RT | RA | RB | XO | / |

    RT <- EXTS(MEM(EA, 4))

Special Registers Altered:

## Load Doubleword with Post-Update

    RA <- (RA) + EXTS(DS || 0b00)
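
The `EXTS(DS || 0b00)` term in the update above can be worked through numerically. The sketch assumes the usual 14-bit DS field, so that appending two zero bits gives a 16-bit signed displacement that is always a multiple of 4; the function name `ds_displacement` is hypothetical.

```python
def ds_displacement(ds):
    """Return EXTS(DS || 0b00) as a signed Python int, assuming a 14-bit DS field."""
    ds &= 0x3FFF
    value = ds << 2            # DS || 0b00: append two zero bits
    if value & 0x8000:         # sign bit of the resulting 16-bit quantity
        value -= 0x10000
    return value
```

Under this assumption, stepping forwards through an array of doublewords uses DS=2 (a displacement of 8), and stepping backwards uses DS=0x3FFE (a displacement of -8).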

Special Registers Altered:

## Load Doubleword with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RT | RA | RB | XO | / |

Special Registers Altered:

# Fixed-Point Store Post-Update

Add the following as a new section in Fixed-Point Store, Book I

## Store Byte with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    MEM(ea, 1) <- (RS)[XLEN-8:XLEN-1]
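
The bit range `(RS)[XLEN-8:XLEN-1]` is in MSB0 numbering, so it selects the least significant byte of RS. The sketch below models a post-update store on that basis. It is an illustration under assumptions: the function name, the `nbytes` parameter, the already sign-extended displacement `d`, and big-endian byte order for multi-byte stores are all choices made here, and the storage address is taken to be (RA) as in the load examples earlier in this RFC.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def store_post_update(gpr, mem, rs, ra, d, nbytes=1):
    """Store the low nbytes of RS at (RA), then add d to RA.

    gpr is a list of register values, mem a bytearray, and d an already
    sign-extended displacement. Big-endian byte order is assumed.
    """
    ea = gpr[ra]                                   # address is just (RA)
    low = gpr[rs] & ((1 << (8 * nbytes)) - 1)      # (RS)[XLEN-8*nbytes:XLEN-1]
    mem[ea:ea + nbytes] = low.to_bytes(nbytes, "big")
    gpr[ra] = (gpr[ra] + d) & MASK                 # update RA after the store
```

With `nbytes=1` only the lowest byte of RS reaches memory; the halfword and word variants correspond to `nbytes=2` and `nbytes=4`.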

Special Registers Altered:

## Store Byte with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RS | RA | RB | XO | / |

    MEM(ea, 1) <- (RS)[XLEN-8:XLEN-1]

Special Registers Altered:

## Store Halfword with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    MEM(ea, 2) <- (RS)[XLEN-16:XLEN-1]

Special Registers Altered:

## Store Halfword with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RS | RA | RB | XO | / |

    MEM(ea, 2) <- (RS)[XLEN-16:XLEN-1]

Special Registers Altered:

## Store Word with Post-Update

    |0 |6 |9 |10 |11 |16 |31 |

    MEM(ea, 4) <- (RS)[XLEN-32:XLEN-1]

Special Registers Altered:

## Store Word with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RS | RA | RB | XO | / |

    MEM(ea, 4) <- (RS)[XLEN-32:XLEN-1]

Special Registers Altered:

## Store Doubleword with Post-Update

    EA <- (RA) + EXTS(DS || 0b00)

Special Registers Altered:

## Store Doubleword with Post-Update Indexed

    |0 |6 |7|8|9 |10 |11|12|13 |15|16|17 |20|21 |31 |
    | PO | RS | RA | RB | XO | / |

Special Registers Altered: