# pluggable extensions

==RB==

This proposal adds standardised extension instructions to the RV
instruction set by introducing a small fixed number N (e.g. N = 8) of
R-type opcodes xcmd0 rd, rs1, rs2, ..., xcmd7 rd, rs1, rs2. These are intended to be used as "overloadable" (slightly crippled) R-type instructions for independently developed extensions in the form of non-standard CPU extensions, IP tiles, or closely coupled external devices.

TL;DR: see below for a C description of how this is supposed to work.

The input value of an xcmd instruction in rs2 is arbitrary. The content of the first input rs1, however, is divided into a 12-bit "logical unit" number (lun) and XLEN - 12 bits of additional data.
The lun bits in rs1 determine a specific (sub)device, and the CPU routes the command to this device with rs1 and rs2 as input and rd as output. Effectively, the xcmd0, ..., xcmd7 instructions are "virtual method" opcodes, overloaded for different extension (sub)devices.
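
The split of rs1 into lun and data can be sketched with plain shifts and masks. This is only an illustration: it assumes XLEN = 64 and that the lun sits in the low 12 bits (as in the C description further down), and the helper names `pack_lun_data`, `reg_lun` and `reg_data` are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration: XLEN = 64, lun in the low 12 bits,
   data in the remaining high bits. */
#define XLEN      64
#define LUN_BITS  12
#define LUN_MASK  ((UINT64_C(1) << LUN_BITS) - 1)
#define DATA_MASK ((UINT64_C(1) << (XLEN - LUN_BITS)) - 1)

/* Pack a lun and a data value into one register-sized word.
   Data bits that do not fit in XLEN - 12 bits are dropped. */
static uint64_t pack_lun_data(uint64_t lun, uint64_t data)
{
    assert(lun <= LUN_MASK);
    return ((data & DATA_MASK) << LUN_BITS) | lun;
}

/* Extract the 12-bit lun that selects the (sub)device. */
static uint64_t reg_lun(uint64_t reg)  { return reg & LUN_MASK; }

/* Extract the XLEN - 12 bits of additional data. */
static uint64_t reg_data(uint64_t reg) { return reg >> LUN_BITS; }
```

With these helpers, `reg_lun(pack_lun_data(33, 5))` gives back 33 and `reg_data(pack_lun_data(33, 5))` gives back 5.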

The specific value of the lun is supposed to be convenient for the CPU and is thus unstandardised. Portable software therefore constructs the lun with a further R-type instruction, xext. It takes a 20-bit universally unique identifier (UUID) that identifies an interface with up to N R-type instructions with the signature of xcmd. An optional sequence number identifies a specific enumerated device on the CPU that implements the interface as a subdevice. For convenience, xext also ORs the low XLEN - 12 bits of rs2 into the data bits of its result. If the UUID is not recognised, a lun of 0 is returned. Note that this scheme gives an easy workaround for the restriction to N (e.g. 8) commands: an implementing device can simply implement several interfaces as routable subdevices, and indeed is expected to do so.

The net effect is that a sequence like

    //fake UUID
    lui rd 0xEDCBA
    xext rd rd rs1
    xcmd0 rd rd rs2

acts like a single namespaced instruction cmd0_EDCBA rd rs1 rs2, with the annoying caveat that rs1 can only use its low XLEN - 12 bits. (The sequence is also not indivisible, but the crucial semantics that you might want to be indivisible is in xcmd0.) Delegation is expected to come at a small
additional performance price compared to a "native" instruction. This should, however, be an acceptable tradeoff in many cases.
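
As a rough illustration of how the three instructions compose, here is a toy C model on 32-bit registers. Everything in it is an assumption made for the sketch: the lun value 32, the device behaviour (adding the data bits and rs2), and reporting a trap through an `int` flag. It is not meant as the proposal's actual semantics.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model on 32-bit registers (RV32), purely illustrative. */
#define TOY_UUID 0xEDCBAu   /* the fake UUID from the sequence above */
#define TOY_LUN  32u        /* hypothetical lun the toy CPU assigns to it */

/* lui: place a 20-bit immediate in bits 31..12. */
static uint32_t toy_lui(uint32_t imm20)
{
    assert(imm20 <= 0xFFFFFu);
    return imm20 << 12;
}

/* xext: map (UUID, device 0) to a lun, keep the low 20 bits of rs2 as data. */
static uint32_t toy_xext(uint32_t rs1, uint32_t rs2)
{
    uint32_t uuid = rs1 >> 12, dev = rs1 & 0xFFFu;
    uint32_t lun = (uuid == TOY_UUID && dev == 0) ? TOY_LUN : 0;
    return ((rs2 & 0xFFFFFu) << 12) | lun;
}

/* xcmd0: route on the lun; the toy device adds its data bits and rs2.
   Sets *trapped and returns 0 for an unknown lun (stand-in for a trap). */
static uint32_t toy_xcmd0(uint32_t rs1, uint32_t rs2, int *trapped)
{
    *trapped = (rs1 & 0xFFFu) != TOY_LUN;
    return *trapped ? 0 : (rs1 >> 12) + rs2;
}

/* The three-instruction sequence, behaving like one namespaced instruction. */
static uint32_t cmd0_EDCBA(uint32_t rs1, uint32_t rs2, int *trapped)
{
    uint32_t rd = toy_lui(TOY_UUID);
    rd = toy_xext(rd, rs1);
    return toy_xcmd0(rd, rs2, trapped);
}
```

In this model `cmd0_EDCBA(3, 4, &t)` yields 7, and so does `cmd0_EDCBA(0x100003, 4, &t)`, because only the low 20 bits of rs1 survive the trip through xext.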

Programmatically, the instructions in the interface are just a set of glorified assembler macros

    org.tinker.tinker:RocknRoll{
        uuid : 0xABCDE
        rock rd rs1 rs2 : xcmd0 rd rs1 rs2
        roll rd rs1 rs2 : xcmd1 rd rs1 rs2
    }

so that the above sequence is more clearly written as

    import(org.tinker.tinker:RocknRoll)
    lui rd org.tinker.tinker:RocknRoll:uuid
    xext rd rd rs1
    org.tinker.tinker:RocknRoll:rock rd rd rs2

(Quite possibly even glorified standard assembler macros are overkill, and it is easier to use defines or ordinary macros with long names. E.g. writing

    #define org_tinker_tinker__RocknRoll__interface_uuid 0xABCDE
    #define org_tinker_tinker__RocknRoll__rock(rd, rs1, rs2) xcmd0 rd, rs1, rs2
    #define org_tinker_tinker__RocknRoll__roll(rd, rs1, rs2) xcmd1 rd, rs1, rs2

allows the same sequence to be written as

    lui rd org_tinker_tinker__RocknRoll__interface_uuid
    xext rd rd rs1
    org_tinker_tinker__RocknRoll__rock(rd, rd, rs2)

Readability of assembler is no big deal for a compiler, but people are supposed to _document_ the interface and its semantics. In particular, a semantics specified in the same way as the semantics of the CPU would be most welcome.)

If several instructions of the same interface are used, one can also use instruction sequences like

    lui t1 org_tinker_tinker__RocknRoll__interface_uuid
    xext t1 t1 zero
    xcmd0 a5, t1, a0 // org_tinker_tinker__RocknRoll__rock(a5, t1, a0)
    xcmd1 t2, t1, a1 // org_tinker_tinker__RocknRoll__roll(t2, t1, a1)
    xcmd0 a0, t1, t2 // org_tinker_tinker__RocknRoll__rock(a0, t1, t2)

This amortises the cost of the xext instruction.

==Implications for the RISC-V ecosystem==

The proposal allows independent groups to define one or more extension interfaces of (slightly crippled) R-type instructions
implemented by an extension device. Such an extension device would be an intrinsic non-standard part of the CPU, an IP tile, or a closely coupled external chip, and would be configured at manufacturing time or at bootup of the CPU.

Having a standardised overloadable interface simply avoids much of the
need for ISA extensions for hardware with non-standard interfaces and
semantics. This is analogous to the way that the standardised overloadable
ioctl interface of the kernel almost completely avoids the need for
extending the kernel with syscalls for the myriad of hardware devices
with their specific interfaces and semantics.

Since the rs1 input of the overloaded xcmd instructions is taken
by the lun, they are restricted in use compared to a normal
R-type instruction (it is possible to pass XLEN - 12 bits of additional info by
ORing it with the lun). Delegation is also expected to come at a small
additional performance price compared to a "native" instruction. This
should be an acceptable tradeoff in most cases.

The expanded flexibility comes at a cost: the standard can specify the
semantics of the delegation mechanism and the interfacing with the rest
of the CPU, but the actual semantics of the overloaded instructions can
only be defined by the designer of the interface. Likewise, a device
can be conforming as far as delegation and interaction with the CPU
is concerned, but whether the hardware conforms to the semantics
of the interface is outside the scope of the spec. Being able to specify
that semantics using the methods used for RV itself is clearly very
valuable. One impetus for doing that is using it for purposes of its own,
effectively freeing opcode space for other purposes. Also, some interfaces
may become de facto or de jure standards themselves, necessitating
hardware to implement competing interfaces. I.e., facilitating a
free-for-all may lead to standards proliferation. C'est la vie.

The only "ISA-collisions" that can still occur are in the 20-bit (~10^6)
interface identifier space, with 12 more bits to identify a device on
a hart that implements the interface. One suggestion is to set aside
2^19 IDs that are handed out for a small fee by a central (automated)
registration (making sure the space is not just claimed), while the
remaining 2^19 are filled by a good hash of a long, plausibly globally
unique human-readable interface name. This gives implementors the choice
between a guaranteed private identifier for a fee, or relying on low
probabilities. On RV64 the UUID can also be extended to 52 bits (> 10^15).
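
A hashed identifier of this kind could be derived along the following lines. This is a sketch under assumptions not made in the proposal: FNV-1a stands in for the "good hash", and the convention that hashed IDs occupy the upper half of the 20-bit space (top bit set) is hypothetical, as is the function name `hashed_interface_id`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated string. */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (; *s != '\0'; ++s) {
        h ^= (uint8_t)*s;
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Hypothetical convention: IDs 0x80000..0xFFFFF are hashed names,
   IDs below 0x80000 are handed out by the registry. */
static uint32_t hashed_interface_id(const char *name)
{
    assert(name != NULL);
    uint32_t h = fnv1a32(name);
    uint32_t folded = (h ^ (h >> 19)) & 0x7FFFFu;   /* xor-fold to 19 bits */
    return 0x80000u | folded;
}
```

Calling `hashed_interface_id("org.tinker.tinker:RocknRoll")` always returns the same value in the range 0x80000 to 0xFFFFF. With only 2^19 slots, collisions between unrelated names remain possible, which is the "relying on low probabilities" part of the trade-off.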

==Description of the extension as C functions==

    /* register format of rs1 for xext instructions, packs device number and UUID */
    typedef struct uuid_device{
        unsigned long dev:12;
        unsigned long uuid: 8*sizeof(long) - 12;
    } uuid_dev_t;

    /* register format for rd of xext and rs1 for xcmd instructions, packs lun and data */
    typedef struct lun_data{
        unsigned long lun:12;
        unsigned long data: 8*sizeof(long) - 12;
    } lun_data_t;

    /* proposed R-type instructions
       xext rd rs1 rs2
       xcmd0 rd rs1 rs2
       xcmd1 rd rs1 rs2
       ...
       xcmd7 rd rs1 rs2
    */

    lun_data_t xext(uuid_dev_t rs1, long rs2);
    long xcmd0(lun_data_t rs1, long rs2);
    long xcmd1(lun_data_t rs1, long rs2);
    ...
    long xcmd<N-1>(lun_data_t rs1, long rs2);

    /* hardware interface presented by an implementing device. */
    typedef
    long device_fn(unsigned short subdevice_xcmd, lun_data_t rs1, long rs2);

    /* cpu internal datatypes */

    enum privilege {user = 0b0001, super = 0b0010, hyper = 0b0100, mach = 0b1000};

    /* cpu internal, does what is on the label */
    static
    enum privilege cpu__current_privilege_level(void);

    typedef
    struct lun{
        unsigned short id:12;
    } lun_t;

    struct uuid_device_priv2lun{
        struct{
            uuid_dev_t uuid_dev;
            enum privilege reqpriv;
        };
        lun_t lun;
    };

    struct device_subdevice{
        device_fn* device_addr;
        unsigned short subdeviceId:12;
    };

    struct lun_priv2device_subdevice{
        struct{
            lun_t lun;
            enum privilege reqpriv;
        };
        struct device_subdevice devAddr_subdevId;
    };

    static
    struct uuid_device_priv2lun cpu__lun_map[];

    /*
       map (UUID, device, privilege) to a 12 bit lun,
       return (lun_t){0} on unknown or no access

       does associative memory lookup and tests privilege.
    */
    static
    lun_t cpu__lookup_lun(const struct uuid_device_priv2lun* lun_map, uuid_dev_t uuid_dev, enum privilege priv);

    lun_data_t xext(uuid_dev_t rs1, long rs2)
    {
        lun_t lun = cpu__lookup_lun(cpu__lun_map, rs1, cpu__current_privilege_level());

        /* keep only the low XLEN - 12 bits of rs2 as data */
        return (lun_data_t){.lun = lun.id, .data = (unsigned long)rs2 % (1UL << (8*sizeof(long) - 12))};
    }
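
The lookup behind xext can be tried out with a small compilable model. The sketch below replaces the associative memory with a linear scan and stores the allowed privilege levels as a bit mask per entry. The table contents, the `PRIV_*` names and the function names are assumptions made for the example, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum privilege { PRIV_USER = 1, PRIV_SUPER = 2, PRIV_HYPER = 4, PRIV_MACH = 8 };

struct lun_entry {
    uint32_t uuid;      /* 20-bit interface UUID */
    uint16_t dev;       /* 12-bit device sequence number */
    unsigned privmask;  /* OR of the privilege levels allowed to obtain the lun */
    uint16_t lun;       /* 12-bit lun handed out by this CPU */
};

/* Hypothetical configuration: one interface implemented by two devices. */
static const struct lun_entry lun_table[] = {
    { 0xABCDE, 0, PRIV_USER | PRIV_SUPER | PRIV_HYPER | PRIV_MACH, 32 },
    { 0xABCDE, 1, PRIV_SUPER | PRIV_HYPER | PRIV_MACH,             34 },
};

/* Linear stand-in for the associative lookup: returns 0 when the
   (UUID, device) pair is unknown or the privilege level has no access. */
static uint16_t lookup_lun(uint32_t uuid, uint16_t dev, enum privilege priv)
{
    for (size_t i = 0; i < sizeof lun_table / sizeof lun_table[0]; ++i) {
        const struct lun_entry *e = &lun_table[i];
        if (e->uuid == uuid && e->dev == dev && (e->privmask & priv) != 0)
            return e->lun;
    }
    return 0;
}

/* xext on a 64-bit register: rs1 = UUID << 12 | dev, result = data << 12 | lun. */
static uint64_t model_xext(uint64_t rs1, uint64_t rs2, enum privilege priv)
{
    uint32_t uuid = (uint32_t)((rs1 >> 12) & 0xFFFFF);
    uint16_t lun = lookup_lun(uuid, (uint16_t)(rs1 & 0xFFF), priv);
    assert(lun <= 0xFFF);
    return (rs2 << 12) | lun;   /* the shift drops the top 12 bits of rs2 */
}
```

In this configuration user code can obtain lun 32 for device 0, while a request for device 1 from user level yields lun 0, just like an unknown UUID.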

    static
    struct lun_priv2device_subdevice cpu__device_subdevice_map[];

    /* map (lun, priv) to struct device_subdevice pair.
       For lun = 0, or unknown (lun, priv) pair, returns (struct device_subdevice){NULL,0}
    */
    static
    struct device_subdevice cpu__lookup_device_subdevice(const struct lun_priv2device_subdevice* dev_subdev_map,
                                                        lun_t lun, enum privilege priv);

    /* functional description of the delegating xcmd0 .. xcmd7 instructions */
    template<k = 0..N-1> //pretend this is C
    long xcmd<k>(lun_data_t rs1, long rs2)
    {
        struct device_subdevice dev_subdev = cpu__lookup_device_subdevice(cpu__device_subdevice_map, (lun_t){rs1.lun}, cpu__current_privilege_level());
        if(dev_subdev.device_addr == NULL)
            trap("Illegal instruction");

        return dev_subdev.device_addr(dev_subdev.subdeviceId | k << 12, rs1, rs2);
    }
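
Since the template above is only pretend C, a compilable approximation of the delegation step may help. The sketch below passes k as an ordinary argument, leaves out the privilege check for brevity, and reports the would-be illegal instruction trap through a return code. The toy device, its add/subtract commands and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Device entry point: low 12 bits of sel are the subdevice id, bits 12 and up are k. */
typedef long device_fn(unsigned sel, uint64_t rs1, long rs2);

struct route { uint16_t lun; device_fn *dev; uint16_t subdev; };

/* Hypothetical device: subdevice 0 implements cmd0 = add, cmd1 = subtract. */
static long toy_device(unsigned sel, uint64_t rs1, long rs2)
{
    long data = (long)(rs1 >> 12);
    switch (sel) {
    case 0 | 0 << 12: return data + rs2;
    case 0 | 1 << 12: return data - rs2;
    default:          return -1;   /* stand-in for a configuration error trap */
    }
}

static const struct route routes[] = { { 32, toy_device, 0 } };

/* Model of xcmd<k>: look up the lun, then delegate with (subdev | k << 12).
   Returns 0 on success and stores the result in *rd; returns -1 where the
   real instruction would raise an illegal instruction trap. */
static int model_xcmd(unsigned k, uint64_t rs1, long rs2, long *rd)
{
    assert(k < 8 && rd != NULL);
    uint16_t lun = (uint16_t)(rs1 & 0xFFF);
    for (size_t i = 0; i < sizeof routes / sizeof routes[0]; ++i) {
        if (lun != 0 && routes[i].lun == lun) {
            *rd = routes[i].dev(routes[i].subdev | k << 12, rs1, rs2);
            return 0;
        }
    }
    return -1;
}
```

With rs1 carrying lun 32 and data 10, `model_xcmd(0, ...)` with rs2 = 3 stores 13 in `*rd` and `model_xcmd(1, ...)` stores 7, while a lun of 0 makes the model report a trap.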

Example:

    #define COM_BIGBUCKS__FROBATE__INTERFACE_UUID 0xABCDE
    #define ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID 0x12345
    #define ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID 0xD0B0D

    com.bigbucks:Frobate{
        uuid: COM_BIGBUCKS__FROBATE__INTERFACE_UUID
        frobate rd rs1 rs2 : xcmd0 rd rs1 rs2
        foo rd rs1 rs2 : xcmd1 rd rs1 rs2
        bar rd rs1 rs2 : xcmd2 rd rs1 rs2
    }

    org.tinker.tinker:RocknRoll{
        uuid: ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID
        rock rd rs1 rs2: xcmd0 rd rs1 rs2
        roll rd rs1 rs2: xcmd1 rd rs1 rs2
    }

    long com_bigbucks__device1(unsigned short subdevice_xcmd, lun_data_t rs1, long rs2)
    {
        switch(subdevice_xcmd) {
        case 0 | 0 << 12 /* com.bigbucks:Frobate:frobate */ : return device1_frobate(rs1, rs2);
        case 42| 0 << 12 /* com.bigbucks:FrobateMach:frobate */ : return device1_frobate_machine_level(rs1, rs2);
        case 0 | 1 << 12 /* com.bigbucks:Frobate:foo */ : return device1_foo(rs1, rs2);
        case 0 | 2 << 12 /* com.bigbucks:Frobate:bar */ : return device1_bar(rs1, rs2);
        case 1 | 0 << 12 /* org.tinker.tinker:RocknRoll:rock */ : return device1_rock(rs1, rs2);
        case 1 | 1 << 12 /* org.tinker.tinker:RocknRoll:roll */ : return device1_roll(rs1, rs2);
        default: trap("hardware configuration error");
        }
    }

    org.tinker.tinker:Jazz{
        uuid: ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID
        boogy rd rs1 rs2: xcmd0 rd rs1 rs2
    }

    long org_tinker_tinker__device2(unsigned short subdevice_xcmd, lun_data_t rs1, long rs2)
    {
        switch(subdevice_xcmd) {
        case 0 | 0 << 12 /* com.bigbucks:Frobate:frobate */ : return device2_frobate(rs1, rs2);
        case 0 | 1 << 12 /* com.bigbucks:Frobate:foo */ : return device2_foo(rs1, rs2);
        case 0 | 2 << 12 /* com.bigbucks:Frobate:bar */ : return device2_bar(rs1, rs2);
        case 1 | 0 << 12 /* org.tinker.tinker:Jazz:boogy */ : return device2_boogy(rs1, rs2);
        default: trap("hardware configuration error");
        }
    }
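
The `subdevice | k << 12` case labels in the device functions above can be made a little more readable with a small macro and named constants. This is merely a suggestion for how an implementer might write it: the macro `SUBDEV_XCMD`, the helper functions and the enumerator names are invented for this sketch.

```c
#include <assert.h>

#define SUBDEV_BITS 12

/* Combine a subdevice id and a command index k into the selector a device receives. */
#define SUBDEV_XCMD(subdev, k) ((unsigned)(subdev) | (unsigned)(k) << SUBDEV_BITS)

/* Checked, run-time variant of the macro. */
static unsigned make_sel(unsigned subdev, unsigned k)
{
    assert(subdev < (1u << SUBDEV_BITS) && k < 8);
    return SUBDEV_XCMD(subdev, k);
}

/* Split a selector back into its two parts. */
static unsigned sel_subdev(unsigned sel) { return sel & ((1u << SUBDEV_BITS) - 1); }
static unsigned sel_cmd(unsigned sel)    { return sel >> SUBDEV_BITS; }

/* Named selectors mirroring the case values used by device1 above. */
enum {
    FROBATE_FROBATE = SUBDEV_XCMD(0, 0),
    FROBATE_FOO     = SUBDEV_XCMD(0, 1),
    FROBATE_BAR     = SUBDEV_XCMD(0, 2),
    ROCKNROLL_ROCK  = SUBDEV_XCMD(1, 0),
    ROCKNROLL_ROLL  = SUBDEV_XCMD(1, 1)
};
```

A device function could then write `case FROBATE_BAR:` instead of `case 0 | 2 << 12:`; both denote the same value.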

    struct lun_priv2device_subdevice cpu__device_subdevice_map[] = {
        // .lun = 0 is an error and falls back to trapping xcmd
        {{.lun = 1, .reqpriv = user}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 1, .reqpriv = super}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 1, .reqpriv = hyper}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 1, .reqpriv = mach}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 2, .reqpriv = user}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        {{.lun = 2, .reqpriv = super}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        {{.lun = 2, .reqpriv = hyper}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        {{.lun = 2, .reqpriv = mach}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        // .lun = 3 .. 7 reserved for other fallback RV interfaces
        // .lun = 8 .. 30 reserved as error numbers, c.li t1 31; bltu rd t1 L_fail tests errors
        // .lun = 31 reserved out of caution
        {{.lun = 32, .reqpriv = user}, .devAddr_subdevId = {com_bigbucks__device1, 0 /* Frobate interface */}},
        {{.lun = 32, .reqpriv = super}, .devAddr_subdevId = {com_bigbucks__device1, 0 /* Frobate interface */}},
        {{.lun = 32, .reqpriv = hyper}, .devAddr_subdevId = {com_bigbucks__device1, 0 /* Frobate interface */}},
        {{.lun = 32, .reqpriv = mach}, .devAddr_subdevId = {com_bigbucks__device1, 42 /* Frobate machine level interface */}},
        {{.lun = 33, .reqpriv = user}, .devAddr_subdevId = {com_bigbucks__device1, 1 /* RocknRoll interface */}},
        {{.lun = 33, .reqpriv = super}, .devAddr_subdevId = {com_bigbucks__device1, 1 /* RocknRoll interface */}},
        {{.lun = 33, .reqpriv = hyper}, .devAddr_subdevId = {com_bigbucks__device1, 1 /* RocknRoll interface */}},
        {{.lun = 34, .reqpriv = super}, .devAddr_subdevId = {org_tinker_tinker__device2, 0 /* Frobate interface */}},
        {{.lun = 34, .reqpriv = hyper}, .devAddr_subdevId = {org_tinker_tinker__device2, 0 /* Frobate interface */}},
        {{.lun = 34, .reqpriv = mach}, .devAddr_subdevId = {org_tinker_tinker__device2, 0 /* Frobate interface */}},
        {{.lun = 35, .reqpriv = super}, .devAddr_subdevId = {org_tinker_tinker__device2, 1 /* Jazz interface */}},
        {{.lun = 35, .reqpriv = hyper}, .devAddr_subdevId = {org_tinker_tinker__device2, 1 /* Jazz interface */}},
    };

    struct uuid_device_priv2lun cpu__lun_map[] = {
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_ZERO__INTERFACE_UUID, .dev = 0}, .reqpriv = user | super | hyper | mach}, .lun = 1},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_MINUSONE__INTERFACE_UUID, .dev = 0}, .reqpriv = user | super | hyper | mach}, .lun = 2},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 0} /* .reqpriv not given */}, .lun = 32},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 1}}, .lun = 34}, //sic!
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID, .dev = 0}}, .lun = 33}, //sic!
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID, .dev = 0}}, .lun = 35}
    };
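
The lun ranges noted in the comments of the device map above can be summarised in a small classifier. The sketch assumes those ranges exactly as commented (0 unknown, 1 to 7 fallback interfaces, 8 to 30 error numbers, 31 reserved, 32 and up devices). The enum and function names are invented for the example.

```c
#include <assert.h>

enum lun_class { LUN_UNKNOWN, LUN_FALLBACK, LUN_ERROR, LUN_RESERVED, LUN_DEVICE };

/* Classify a 12-bit lun according to the ranges in the map comments. */
static enum lun_class classify_lun(unsigned lun)
{
    assert(lun < 4096);
    if (lun == 0)  return LUN_UNKNOWN;   /* xcmd traps */
    if (lun <= 7)  return LUN_FALLBACK;  /* 1 and 2 in use, 3..7 reserved for fallbacks */
    if (lun <= 30) return LUN_ERROR;     /* error numbers */
    if (lun == 31) return LUN_RESERVED;  /* reserved out of caution */
    return LUN_DEVICE;                   /* device interfaces in this example */
}

/* C counterpart of "c.li t1 31; bltu rd t1 L_fail", assuming the data bits
   of rd are zero so that rd holds just the lun. */
static int xext_failed(unsigned long rd)
{
    return rd < 31;
}
```

Note that the single comparison against 31 also treats lun 0 and the fallback luns 1 to 7 as failures, so it is only a test for "no real device interface was found".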