# pluggable extensions

This proposal adds a standardised set of extension instructions to the RV
instruction set. It introduces a small, fixed number N (e.g. N = 8) of
R-type opcodes, xcmd0 rd, rs1, rs2 through xcmd7 rd, rs1, rs2, that are intended to be used as "overloadable" (slightly crippled) R-type instructions by independently developed extensions in the form of non-standard CPU extensions, IP tiles, or closely coupled external devices.

TL;DR: see the C description below for how this is supposed to work.

The input value in rs2 of an xcmd instruction is arbitrary. The content of the first input, rs1, however, is divided into a 12-bit "logical unit" number (lun) and XLEN - 12 bits of additional data.
The lun bits in rs1 determine a specific (sub)device, and the CPU routes the command to this device with rs1 and rs2 as inputs and rd as output. Effectively, the xcmd0, ..., xcmd7 instructions are "virtual method" opcodes, overloaded for different extension (sub)devices.
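
As a minimal sketch of this split, the C helpers below pack and unpack an rs1 value. They assume XLEN = 64 and that the lun sits in the low 12 bits, which is how the bit-fields of the C description further down would usually be laid out on a little-endian RISC-V ABI; the prose above does not pin the bit positions. The names `pack_rs1`, `rs1_lun` and `rs1_data` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LUN_BITS 12u
#define LUN_MASK ((UINT64_C(1) << LUN_BITS) - 1)

/* Pack a 12-bit lun and XLEN - 12 bits of data into one rs1 value.
   Assumed layout: lun in bits 11..0, data in bits 63..12. */
static uint64_t pack_rs1(uint64_t lun, uint64_t data)
{
    assert(lun <= LUN_MASK);
    assert(data <= (UINT64_MAX >> LUN_BITS));
    return (data << LUN_BITS) | lun;
}

/* The part of rs1 the CPU would use for routing. */
static uint64_t rs1_lun(uint64_t rs1)
{
    return rs1 & LUN_MASK;
}

/* The part of rs1 that is passed through to the device as payload. */
static uint64_t rs1_data(uint64_t rs1)
{
    return rs1 >> LUN_BITS;
}
```

With this layout, `pack_rs1(33, 1)` gives `0x1021`: lun 33 (`0x021`) in the low 12 bits and the data value 1 directly above it.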

The specific value of the lun is meant to be convenient for the CPU and is therefore not standardised. Portable software instead constructs the lun with a further R-type instruction, xext. It takes a 20-bit universally unique identifier (UUID) that identifies an interface of up to N R-type instructions with the signature of xcmd. An optional sequence number identifies a specific enumerated device on the CPU that implements the interface as a subdevice. For convenience, xext also ORs in bits rs2[0..XLEN-12]. If the UUID is not recognised, 0 is returned. Note that this scheme gives an easy workaround for the restriction to N (e.g. 8) commands: an implementing device can simply implement several interfaces as routable subdevices, and indeed is expected to do so.

The net effect is that a sequence like

    //fake UUID
    lui rd 0xEDCBA
    xext rd rd rs1
    xcmd0 rd rd rs2

acts like a single namespaced instruction cmd0_EDCBA rd rs1 rs2, with the annoying caveat that rs1 can only use bits 0..XLEN-12. (The sequence is also not indivisible, but the crucial semantics that you might want to be indivisible is in xcmd0.) Delegation is expected to come at a small
additional performance price compared to a "native" instruction. This should, however, be an acceptable tradeoff in many cases.
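
To make the first step of that sequence concrete, the sketch below models what `lui` leaves in the register that is then handed to xext. It assumes RV64, where `lui` places its 20-bit immediate in bits 31..12 and sign-extends the 32-bit result. Setting a non-zero sequence number in the low 12 bits (for instance with a following `ori`) is an assumption; the sequence above only uses device 0. The helper names `lui_rv64` and `xext_operand` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of RV64 `lui rd, imm20`: the immediate lands in bits 31..12,
   bits 11..0 are zero, and bit 31 is copied into bits 63..32. */
static uint64_t lui_rv64(uint32_t imm20)
{
    assert(imm20 <= 0xFFFFFu);
    uint64_t value = (uint64_t)imm20 << 12;
    if (value & UINT64_C(0x80000000))
        value |= UINT64_C(0xFFFFFFFF00000000);
    return value;
}

/* Operand for xext: UUID in the upper bits, sequence number
   (assumed to be OR-ed in separately) in the low 12 bits. */
static uint64_t xext_operand(uint32_t uuid20, uint32_t seq12)
{
    assert(seq12 <= 0xFFFu);
    return lui_rv64(uuid20) | seq12;
}
```

Note that a UUID with its top bit set, such as the fake `0xEDCBA`, arrives on RV64 as `0xFFFFFFFFEDCBA000`. Whether xext should compare only bits 31..12 or the full upper XLEN - 12 bits is something the proposal would have to spell out.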


Programmatically, the instructions in the interface are just a set of glorified assembler macros

    org.tinker.tinker:RocknRoll{
        uuid : 0xABCDE
        rock rd rs1 rs2 : xcmd0 rd rs1 rs2
        roll rd rs1 rs2 : xcmd1 rd rs1 rs2
    }

so that the above sequence is more clearly written as

    import(org.tinker.tinker:RocknRoll)
    lui rd org.tinker.tinker:RocknRoll:uuid
    xext rd rd rs1
    org.tinker.tinker:RocknRoll:rock rd rd rs2

(Quite possibly even glorified standard assembler macros are overkill, and it is easier to use defines or ordinary macros with long names. E.g. writing

    #define org_tinker_tinker__RocknRoll__interface_uuid 0xABCDE
    #define org_tinker_tinker__RocknRoll__rock(rd, rs1, rs2) xcmd0 rd, rs1, rs2
    #define org_tinker_tinker__RocknRoll__roll(rd, rs1, rs2) xcmd1 rd, rs1, rs2

allows the same sequence to be written as

    lui rd org_tinker_tinker__RocknRoll__interface_uuid
    xext rd rd rs1
    org_tinker_tinker__RocknRoll__rock(rd, rd, rs2)

Readability of assembler is no big deal for a compiler, but people are supposed to _document_ the interface and its semantics. In particular, semantics specified in the same way as the semantics of the CPU would be most welcome.)


If several instructions of the same interface are used, one can also use instruction sequences like

    lui t1 org_tinker_tinker__RocknRoll__interface_uuid
    xext t1 t1 zero
    xcmd0 a5, t1, a0 // org_tinker_tinker__RocknRoll__rock(a5, t1, a0)
    xcmd1 t2, t1, a5 // org_tinker_tinker__RocknRoll__roll(t2, t1, a5)
    xcmd0 a0, t1, t2 // org_tinker_tinker__RocknRoll__rock(a0, t1, t2)

This amortises the cost of the xext instruction.

# Implications for the RISC-V ecosystem


The proposal allows independent groups to define one or more extension
interfaces of (slightly crippled) R-type instructions implemented by an
extension device. Such an extension device would be a native but non-standard
extension of the CPU, an IP tile, or a closely coupled external chip, and would
be configured at manufacturing time or at bootup of the CPU.

Having a standardised overloadable interface simply avoids much of the
need for ISA extensions for hardware with non-standard interfaces and
semantics. This is analogous to the way that the standardised overloadable
ioctl interface of the kernel almost completely avoids the need for
extending the kernel with syscalls for the myriad of hardware devices
with their specific interfaces and semantics.

The expanded flexibility comes at a cost: the standard can specify the
semantics of the delegation mechanism and the interfacing with the rest
of the CPU, but the actual semantics of the overloaded instructions can
only be defined by the designer of the interface. Likewise, a device
can be conforming as far as delegation and interaction with the CPU
are concerned, but whether the hardware conforms to the semantics
of the interface is outside the scope of the spec. Being able to specify
those semantics using the methods used for RV itself is clearly very
valuable. One impetus for doing that is that RV could use the mechanism for purposes of its own,
effectively freeing opcode space for other purposes. Also, some interfaces
may become de facto or de jure standards themselves, requiring
hardware to implement competing interfaces. I.e., facilitating a
free-for-all may lead to standards proliferation. C'est la vie.

The only "ISA collisions" that can still occur are in the 20-bit (~10^6)
interface identifier space, with 12 more bits to identify a device on
a hart that implements the interface. One suggestion is to set aside
2^19 ids that are handed out for a small fee by a central (automated)
registry (the fee making sure the space is not just claimed), while the
remaining 2^19 are used as a good hash of a long, plausibly globally
unique, human-readable interface name. This gives implementors the choice
between paying a fee for a guaranteed private identifier and relying on low
collision probabilities. On RV64 the UUID can also be extended to 52 bits (> 10^15).
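
The hashed half of the identifier space could be derived along the following lines. This is only a sketch under stated assumptions: the proposal names no hash function, so 32-bit FNV-1a stands in for the "good hash" (a real registry would more plausibly pick a cryptographic hash), and using bit 19 to separate hashed ids from registered ones is an invented convention. The function names are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated string (stand-in hash). */
static uint32_t fnv1a32(const char *s)
{
    assert(s != NULL);
    uint32_t h = 2166136261u;            /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* Map an interface name to a 20-bit id in the hashed half of the
   space. Assumption: bit 19 set means "hashed", bit 19 clear means
   "handed out by the registry". */
static uint32_t hashed_interface_id(const char *name)
{
    uint32_t h = fnv1a32(name);
    uint32_t folded = (h ^ (h >> 19)) & 0x7FFFFu;   /* fold 32 bits into 19 */
    return 0x80000u | folded;
}
```

An assembler or build script could then compute `hashed_interface_id("org.tinker.tinker:RocknRoll")` once and emit the result as the `lui` immediate. With only 2^19 hashed ids, accidental collisions become likely once roughly a thousand names exist, which is the "low probabilities" trade-off mentioned above.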


# Description of the extension as C functions

    /* register format of rs1 for xext instructions */
    typedef struct uuid_device{
        unsigned long dev:12;
        unsigned long uuid: 8*sizeof(long) - 12;
    } uuid_dev_t;

    /* register format for rd of xext and rs1 for xcmd instructions, packs lun and data */
    typedef struct lun_data{
        unsigned long lun:12;
        unsigned long data: 8*sizeof(long) - 12;
    } lun_data_t;

    /* proposed R-type instructions
    xext rd rs1 rs2
    xcmd0 rd rs1 rs2
    xcmd1 rd rs1 rs2
    ...
    xcmd7 rd rs1 rs2
    */

    lun_data_t xext(uuid_dev_t rs1, long rs2);
    long xcmd0(lun_data_t rs1, long rs2);
    long xcmd1(lun_data_t rs1, long rs2);
    ...
    long xcmd<N-1>(lun_data_t rs1, long rs2);

    /* hardware interface presented by an implementing device. */
    typedef
    long device_fn(unsigned short subdevice_xcmd, lun_data_t rs1, long rs2);

    /* cpu internal datatypes */

    enum privilege {user = 0b0001, super = 0b0010, hyper = 0b0100, mach = 0b1000};

    /* cpu internal, does what is on the label */
    static
    enum privilege cpu__current_privilege_level(void);

    typedef
    struct lun{
        unsigned short id:12;
    } lun_t;

    /* key (uuid_dev, priv) maps to value lun */
    struct uuid_device_priv2lun{
        struct{
            uuid_dev_t uuid_dev;
            enum privilege priv;
        };
        lun_t lun;
    };

    struct device_subdevice{
        device_fn* devAddr;
        unsigned short subdevId:12;
    };

    /* key (lun, priv) maps to value (device, subdevice) */
    struct lun_priv2device_subdevice{
        struct{
            lun_t lun;
            enum privilege priv;
        };
        struct device_subdevice devAddr_subdevId;
    };

    static
    struct uuid_device_priv2lun cpu__lun_map[];

    /*
       map (UUID, device, privilege) to a 12 bit lun,
       return (lun_t){0} on unknown or no access

       does associative memory lookup and tests privilege.
    */
    static
    lun_t cpu__lookup_lun(const struct uuid_device_priv2lun* lun_map, uuid_dev_t uuid_dev, enum privilege priv);


    lun_data_t xext(uuid_dev_t rs1, long rs2)
    {
        lun_t lun = cpu__lookup_lun(cpu__lun_map, rs1, cpu__current_privilege_level());

        /* only the low XLEN - 12 bits of rs2 fit in the data field */
        return (lun_data_t){.lun = lun.id, .data = (unsigned long)rs2 & ((1UL << (8*sizeof(long) - 12)) - 1)};
    }




    static
    struct lun_priv2device_subdevice cpu__device_subdevice_map[];

    /* map (lun, priv) to struct device_subdevice pair.
       For lun = 0, or unknown (lun, priv) pair, returns (struct device_subdevice){NULL,0}
    */
    static
    struct device_subdevice cpu__lookup_device_subdevice(const struct lun_priv2device_subdevice* dev_subdev_map,
                                                         lun_t lun, enum privilege priv);

    /* functional description of the delegating xcmd0 .. xcmd7 instructions */
    template<k = 0..N-1> //pretend this is C
    long xcmd<k>(lun_data_t rs1, long rs2)
    {
        struct device_subdevice dev_subdev = cpu__lookup_device_subdevice(cpu__device_subdevice_map, (lun_t){rs1.lun}, cpu__current_privilege_level());
        if(dev_subdev.devAddr == NULL)
            trap("Illegal instruction");

        /* low 12 bits select the subdevice, bits 12 and up carry the command number k */
        return dev_subdev.devAddr(dev_subdev.subdevId | k << 12, rs1, rs2);
    }
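
The description above is deliberately "pretend C". For experimenting with the delegation flow, a compilable toy model can look like the following sketch. It makes several simplifying assumptions that are not part of the proposal: registers are plain `uint64_t` values with the lun in the low 12 bits, each table row carries a bitmask of allowed privilege levels instead of one row per level, the single device `toy_device` and its add/subtract commands are invented, and a trap is modelled by setting the flag `xcmd_trapped`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUN_BITS 12u
#define LUN_MASK 0xFFFu

enum privilege { PRIV_USER = 1, PRIV_SUPER = 2, PRIV_HYPER = 4, PRIV_MACH = 8 };

typedef long device_fn(unsigned subdevice_xcmd, uint64_t rs1, long rs2);

/* One row per (UUID, device) pair; privs is a bitmask of allowed levels. */
struct lun_entry { uint32_t uuid; unsigned dev; unsigned privs; unsigned lun; };
/* One row per lun; subdev selects the interface inside the device. */
struct dev_entry { unsigned lun; unsigned privs; device_fn *fn; unsigned subdev; };

/* Hypothetical device: xcmd0 adds rs2 to the data bits, xcmd1 subtracts. */
static long toy_device(unsigned subdevice_xcmd, uint64_t rs1, long rs2)
{
    long data = (long)(rs1 >> LUN_BITS);
    switch (subdevice_xcmd) {
    case 0 | 0 << 12: return data + rs2;
    case 0 | 1 << 12: return data - rs2;
    default:          return -1;   /* a real device would signal an error */
    }
}

static const struct lun_entry lun_map[] = {
    { 0x12345, 0, PRIV_USER | PRIV_SUPER, 32 },
};
static const struct dev_entry dev_map[] = {
    { 32, PRIV_USER | PRIV_SUPER, toy_device, 0 },
};

static int xcmd_trapped;   /* set to 1 where a CPU would raise a trap */

/* xext: translate (UUID, device) to a lun and attach the low bits of rs2.
   Yields lun 0 when the pair is unknown or not accessible. */
static uint64_t xext(uint64_t rs1, uint64_t rs2, enum privilege priv)
{
    uint32_t uuid = (uint32_t)((rs1 >> LUN_BITS) & 0xFFFFFu);
    unsigned dev = (unsigned)(rs1 & LUN_MASK);
    unsigned lun = 0;
    for (size_t i = 0; i < sizeof lun_map / sizeof lun_map[0]; i++)
        if (lun_map[i].uuid == uuid && lun_map[i].dev == dev &&
            (lun_map[i].privs & (unsigned)priv))
            lun = lun_map[i].lun;
    return (rs2 << LUN_BITS) | lun;
}

/* xcmd<k>: route the command to the device registered for the lun in rs1. */
static long xcmd(unsigned k, uint64_t rs1, long rs2, enum privilege priv)
{
    unsigned lun = (unsigned)(rs1 & LUN_MASK);
    assert(k < 8);
    xcmd_trapped = 1;
    for (size_t i = 0; lun != 0 && i < sizeof dev_map / sizeof dev_map[0]; i++)
        if (dev_map[i].lun == lun && (dev_map[i].privs & (unsigned)priv)) {
            xcmd_trapped = 0;
            return dev_map[i].fn(dev_map[i].subdev | k << 12, rs1, rs2);
        }
    return 0;
}
```

In this model, `xext((uint64_t)0x12345 << 12, 40, PRIV_USER)` produces a handle with lun 32 and data 40, and passing that handle to `xcmd(0, handle, 2, PRIV_USER)` returns 42. The same call at `PRIV_MACH` sets `xcmd_trapped`, because the toy tables grant no machine-level access.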



Example:

    #define COM_BIGBUCKS__FROBATE__INTERFACE_UUID 0xABCDE
    #define ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID 0x12345
    #define ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID 0xD0B0D
    /*
    com.bigbucks:Frobate{
        uuid: COM_BIGBUCKS__FROBATE__INTERFACE_UUID
        frobate rd rs1 rs2 : xcmd0 rd rs1 rs2
        foo rd rs1 rs2 : xcmd1 rd rs1 rs2
        bar rd rs1 rs2 : xcmd2 rd rs1 rs2
    }

    org.tinker.tinker:RocknRoll{
        uuid: ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID
        rock rd rs1 rs2: xcmd0 rd rs1 rs2
        roll rd rs1 rs2: xcmd1 rd rs1 rs2
    }
    */

    /* device 1 implements Frobate (subdevice 0, machine level variant 64) and RocknRoll (subdevice 1) */
    long com_bigbucks__device1(unsigned short subdevice_xcmd, lun_data_t rs1, long rs2)
    {
        switch(subdevice_xcmd) {
        case 0 | 0 << 12 /* com.bigbucks:Frobate:frobate */ : return device1_frobate(rs1, rs2);
        case 64| 0 << 12 /* com.bigbucks:FrobateMach:frobate */ : return device1_frobate_machine_level(rs1, rs2);
        case 0 | 1 << 12 /* com.bigbucks:Frobate:foo */ : return device1_foo(rs1, rs2);
        case 0 | 2 << 12 /* com.bigbucks:Frobate:bar */ : return device1_bar(rs1, rs2);
        case 1 | 0 << 12 /* org.tinker.tinker:RocknRoll:rock */ : return device1_rock(rs1, rs2);
        case 1 | 1 << 12 /* org.tinker.tinker:RocknRoll:roll */ : return device1_roll(rs1, rs2);
        default: trap("hardware configuration error");
        }
    }

    /*
    org.tinker.tinker:Jazz{
        uuid: ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID
        boogy rd rs1 rs2: xcmd0 rd rs1 rs2
    }
    */

    /* device 2 implements Frobate (subdevice 0) and Jazz (subdevice 1) */
    long org_tinker_tinker__device2(unsigned short subdevice_xcmd, lun_data_t rs1, long rs2)
    {
        switch(subdevice_xcmd){
        case 0 | 0 << 12 /* com.bigbucks:Frobate:frobate */: return device2_frobate(rs1, rs2);
        case 0 | 1 << 12 /* com.bigbucks:Frobate:foo */ : return device2_foo(rs1, rs2);
        case 0 | 2 << 12 /* com.bigbucks:Frobate:bar */ : return device2_bar(rs1, rs2);
        case 1 | 0 << 12 /* org.tinker.tinker:Jazz:boogy */: return device2_boogy(rs1, rs2);
        default: trap("hardware configuration error");
        }
    }

    /* (lun, priv) to (device, subdevice) table; fallback is the device behind the fallback RV interfaces (not shown) */
    struct lun_priv2device_subdevice cpu__device_subdevice_map[] = {
        // .lun = 0 signals an error and makes xcmd trap
        {{.lun = 1, .priv = user}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 1, .priv = super}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 1, .priv = hyper}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 1, .priv = mach}, .devAddr_subdevId = {fallback, 0 /* ReturnZero */}},
        {{.lun = 2, .priv = user}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        {{.lun = 2, .priv = super}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        {{.lun = 2, .priv = hyper}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        {{.lun = 2, .priv = mach}, .devAddr_subdevId = {fallback, 1 /* ReturnMinusOne */}},
        // .lun = 3 .. 7 reserved for other fallback RV interfaces
        // .lun = 8 .. 30 reserved as error numbers, c.li t1 31; bltu rd t1 L_fail tests errors
        // .lun = 31 reserved out of caution
        {{.lun = 32, .priv = user}, .devAddr_subdevId = {com_bigbucks__device1, 0 /* Frobate interface */}},
        {{.lun = 32, .priv = super}, .devAddr_subdevId = {com_bigbucks__device1, 0 /* Frobate interface */}},
        {{.lun = 32, .priv = hyper}, .devAddr_subdevId = {com_bigbucks__device1, 0 /* Frobate interface */}},
        {{.lun = 32, .priv = mach}, .devAddr_subdevId = {com_bigbucks__device1, 64 /* Frobate machine level interface */}},
        {{.lun = 33, .priv = user}, .devAddr_subdevId = {com_bigbucks__device1, 1 /* RocknRoll interface */}},
        {{.lun = 33, .priv = super}, .devAddr_subdevId = {com_bigbucks__device1, 1 /* RocknRoll interface */}},
        {{.lun = 33, .priv = hyper}, .devAddr_subdevId = {com_bigbucks__device1, 1 /* RocknRoll interface */}},
        {{.lun = 34, .priv = super}, .devAddr_subdevId = {org_tinker_tinker__device2, 0 /* Frobate interface */}},
        {{.lun = 34, .priv = hyper}, .devAddr_subdevId = {org_tinker_tinker__device2, 0 /* Frobate interface */}},
        {{.lun = 34, .priv = mach}, .devAddr_subdevId = {org_tinker_tinker__device2, 0 /* Frobate interface */}},
        {{.lun = 35, .priv = super}, .devAddr_subdevId = {org_tinker_tinker__device2, 1 /* Jazz interface */}},
        {{.lun = 35, .priv = hyper}, .devAddr_subdevId = {org_tinker_tinker__device2, 1 /* Jazz interface */}},
    };


    /* (UUID, device, priv) to lun table */
    struct uuid_device_priv2lun cpu__lun_map[] = {
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_ZERO__INTERFACE_UUID, .dev = 0}, .priv = user}, .lun = 1},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_ZERO__INTERFACE_UUID, .dev = 0}, .priv = super}, .lun = 1},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_ZERO__INTERFACE_UUID, .dev = 0}, .priv = hyper}, .lun = 1},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_ZERO__INTERFACE_UUID, .dev = 0}, .priv = mach}, .lun = 1},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_MINUSONE__INTERFACE_UUID, .dev = 0}, .priv = user}, .lun = 2},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_MINUSONE__INTERFACE_UUID, .dev = 0}, .priv = super}, .lun = 2},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_MINUSONE__INTERFACE_UUID, .dev = 0}, .priv = hyper}, .lun = 2},
        {{.uuid_dev = {.uuid = ORG_RISCV__FALLBACK__RETURN_MINUSONE__INTERFACE_UUID, .dev = 0}, .priv = mach}, .lun = 2},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 0}, .priv = user}, .lun = 32}, //32 sic!
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 0}, .priv = super}, .lun = 32},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 0}, .priv = hyper}, .lun = 32},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 0}, .priv = mach}, .lun = 32},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 1}, .priv = super}, .lun = 34}, //34 sic!
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 1}, .priv = hyper}, .lun = 34},
        {{.uuid_dev = {.uuid = COM_BIGBUCKS__FROBATE__INTERFACE_UUID, .dev = 1}, .priv = mach}, .lun = 34},
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID, .dev = 0}, .priv = user}, .lun = 33}, //33 sic!
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID, .dev = 0}, .priv = super}, .lun = 33},
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__ROCKNROLL__INTERFACE_UUID, .dev = 0}, .priv = hyper}, .lun = 33},
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID, .dev = 0}, .priv = super}, .lun = 35},
        {{.uuid_dev = {.uuid = ORG_TINKER_TINKER__JAZZ__INTERFACE_UUID, .dev = 0}, .priv = hyper}, .lun = 35},
    };