# Proposal for Element-Grouping in SimpleV on RV64GC

#### Notes

i*N* denotes a sign-agnostic *N*-bit integer;
similarly, f*N* denotes an *N*-bit floating-point number. *VL* denotes the current vector length.

For the unused elements in an integer register, the used element closest to the MSB is sign-extended on write and the unused elements are ignored on read.
The unused elements in a floating-point register are treated as if they are set to all ones on write and are ignored on read, matching the existing standard for storing smaller FP values in larger registers.
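
The write and read rules above can be sketched in Python as follows. This is a minimal illustration, not part of the proposal: the function names are hypothetical, and it assumes that element 0 occupies the least-significant bits of the 64-bit register, which the notes do not state explicitly.

```python
XLEN = 64  # register width on RV64


def write_int_elements(values, bits):
    """Pack signed `bits`-wide integers into one 64-bit register image.

    Assumption (not stated in the proposal): element 0 sits in the
    least-significant bits.
    """
    used = len(values) * bits
    assert 0 < used <= XLEN
    mask = (1 << bits) - 1
    reg = 0
    for i, v in enumerate(values):
        reg |= (v & mask) << (i * bits)
    # Sign-extend the used element closest to the MSB into the unused bits.
    if used < XLEN and (reg >> (used - 1)) & 1:
        reg |= ((1 << XLEN) - 1) & ~((1 << used) - 1)
    return reg


def write_fp_elements(raw_bits, bits):
    """Pack raw FP bit patterns; the unused upper bits become all ones."""
    used = len(raw_bits) * bits
    assert 0 < used <= XLEN
    mask = (1 << bits) - 1
    reg = 0
    for i, v in enumerate(raw_bits):
        reg |= (v & mask) << (i * bits)
    reg |= ((1 << XLEN) - 1) & ~((1 << used) - 1)
    return reg


def read_elements(reg, bits, count):
    """Read back `count` elements, ignoring the unused upper bits."""
    mask = (1 << bits) - 1
    return [(reg >> (i * bits)) & mask for i in range(count)]
```

Under these assumptions, `write_int_elements([1, -1], 8)` gives `0xFFFFFFFFFFFFFF01`, because the upper used element is negative, and `write_fp_elements([0x3C00], 16)` gives `0xFFFFFFFFFFFF3C00`.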

For grouped modes, *VL* denotes the number of groups, not the number of elements.
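
The Regs/Group, Elms/Reg and # Regs columns in the tables that follow appear to derive from the element width and group size alone. The sketch below reproduces them, assuming 64-bit registers and that each register holds as many elements as fit; `group_layout` is a hypothetical helper name, not something defined by the proposal.

```python
XLEN = 64  # register width on RV64


def group_layout(elem_bits, group_size, vl):
    """Return the register layout for a mode such as i32x3xVL.

    Assumption: a register is filled with as many elements as fit
    before the next register of the group is used.
    """
    max_per_reg = XLEN // elem_bits
    elms_per_reg = min(group_size, max_per_reg)
    # Ceiling division: registers needed to hold one group.
    regs_per_group = -(-group_size // max_per_reg)
    return {
        "regs_per_group": regs_per_group,
        "elms_per_reg": elms_per_reg,
        # VL counts groups, so the total scales with regs_per_group.
        "total_regs": regs_per_group * vl,
    }
```

For example, `group_layout(32, 3, 4)` describes i32x3x*VL* with *VL* = 4: two registers per group, two elements per register, and eight registers in total.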

### Group Size 1

| Mode       | # Regs | Regs/Group | Elms/Reg | Packed/Non-packed SIMD   |
|------------|--------|------------|----------|--------------------------|
| i8x1x*VL*  | 1x*VL* | 1          | 1        | i64x1x*VL* (Non-packed)  |
| i16x1x*VL* | 1x*VL* | 1          | 1        | i64x1x*VL* (Non-packed)  |
| i32x1x*VL* | 1x*VL* | 1          | 1        | i64x1x*VL* (Non-packed)  |
| i64x1x*VL* | 1x*VL* | 1          | 1        | i64x1x*VL* (Non-packed)  |
| f16x1x*VL* | 1x*VL* | 1          | 1        | f16x1x*VL* (Non-packed)  |
| f32x1x*VL* | 1x*VL* | 1          | 1        | f32x1x*VL* (Non-packed)  |
| f64x1x*VL* | 1x*VL* | 1          | 1        | f64x1x*VL* (Non-packed)  |

### Group Size 2

| Mode       | # Regs | Regs/Group | Elms/Reg | Packed/Non-packed SIMD   |
|------------|--------|------------|----------|--------------------------|
| i8x2x*VL*  | 1x*VL* | 1          | 2        | Not supported            |
| i16x2x*VL* | 1x*VL* | 1          | 2        | Not supported            |
| i32x2x*VL* | 1x*VL* | 1          | 2        | i32x2x*VL* (Packed)      |
| i64x2x*VL* | 2x*VL* | 2          | 1        | i64x2x*VL* (Non-packed)* |
| f16x2x*VL* | 1x*VL* | 1          | 2        | Not supported            |
| f32x2x*VL* | 1x*VL* | 1          | 2        | f32x2x*VL* (Packed)      |
| f64x2x*VL* | 2x*VL* | 2          | 1        | f64x2x*VL* (Non-packed)* |

\* Not supported unless *VL* is changed

### Group Size 3

| Mode       | # Regs | Regs/Group | Elms/Reg | Packed/Non-packed SIMD   |
|------------|--------|------------|----------|--------------------------|
| i8x3x*VL*  | 1x*VL* | 1          | 3        | Not supported            |
| i16x3x*VL* | 1x*VL* | 1          | 3        | Not supported            |
| i32x3x*VL* | 2x*VL* | 2          | 2        | Not supported            |
| i64x3x*VL* | 3x*VL* | 3          | 1        | i64x3x*VL* (Non-packed)* |
| f16x3x*VL* | 1x*VL* | 1          | 3        | Not supported            |
| f32x3x*VL* | 2x*VL* | 2          | 2        | Not supported            |
| f64x3x*VL* | 3x*VL* | 3          | 1        | f64x3x*VL* (Non-packed)* |

\* Not supported unless *VL* is changed

### Group Size 4

| Mode       | # Regs | Regs/Group | Elms/Reg | Packed/Non-packed SIMD   |
|------------|--------|------------|----------|--------------------------|
| i8x4x*VL*  | 1x*VL* | 1          | 4        | Not supported            |
| i16x4x*VL* | 1x*VL* | 1          | 4        | i16x4x*VL* (Packed)      |
| i32x4x*VL* | 2x*VL* | 2          | 2        | i32x4x*VL* (Packed)*     |
| i64x4x*VL* | 4x*VL* | 4          | 1        | i64x4x*VL* (Non-packed)* |
| f16x4x*VL* | 1x*VL* | 1          | 4        | f16x4x*VL* (Packed)      |
| f32x4x*VL* | 2x*VL* | 2          | 2        | f32x4x*VL* (Packed)*     |
| f64x4x*VL* | 4x*VL* | 4          | 1        | f64x4x*VL* (Non-packed)* |

\* Not supported unless *VL* is changed

...

### Group Size 7

| Mode       | # Regs | Regs/Group | Elms/Reg | Packed/Non-packed SIMD   |
|------------|--------|------------|----------|--------------------------|
| i8x7x*VL*  | 1x*VL* | 1          | 7        | Not supported            |
| i16x7x*VL* | 2x*VL* | 2          | 4        | Not supported            |
| i32x7x*VL* | 4x*VL* | 4          | 2        | Not supported            |
| i64x7x*VL* | 7x*VL* | 7          | 1        | i64x7x*VL* (Non-packed)* |
| f16x7x*VL* | 2x*VL* | 2          | 4        | Not supported            |
| f32x7x*VL* | 4x*VL* | 4          | 2        | Not supported            |
| f64x7x*VL* | 7x*VL* | 7          | 1        | f64x7x*VL* (Non-packed)* |

\* Not supported unless *VL* is changed

### Group Size 8

| Mode       | # Regs | Regs/Group | Elms/Reg | Packed/Non-packed SIMD   |
|------------|--------|------------|----------|--------------------------|
| i8x8x*VL*  | 1x*VL* | 1          | 8        | i8x8x*VL* (Packed)       |
| i16x8x*VL* | 2x*VL* | 2          | 4        | i16x8x*VL* (Packed)*     |
| i32x8x*VL* | 4x*VL* | 4          | 2        | i32x8x*VL* (Packed)*     |
| i64x8x*VL* | 8x*VL* | 8          | 1        | i64x8x*VL* (Non-packed)* |
| f16x8x*VL* | 2x*VL* | 2          | 4        | f16x8x*VL* (Packed)*     |
| f32x8x*VL* | 4x*VL* | 4          | 2        | f32x8x*VL* (Packed)*     |
| f64x8x*VL* | 8x*VL* | 8          | 1        | f64x8x*VL* (Non-packed)* |

\* Not supported unless *VL* is changed
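
The Packed/Non-packed SIMD column in the tables above appears to follow a simple pattern, sketched below. The rule is inferred from the rows shown and is not stated by the proposal, so treat it as an assumption; `simd_equivalent` is a hypothetical name, and the elided group sizes are not covered by this check.

```python
XLEN = 64  # register width on RV64


def simd_equivalent(kind, elem_bits, group_size):
    """Return (description, needs_vl_change) for a grouped mode.

    `kind` is "i" or "f". The rule is inferred from the tables:
    a mode has a SIMD equivalent only if its group fills whole registers.
    """
    group_bits = elem_bits * group_size
    regs_per_group = -(-group_bits // XLEN)  # ceiling division
    if group_size == 1:
        # Lone integers are sign-extended, so they behave as i64.
        width = XLEN if kind == "i" else elem_bits
        return (f"{kind}{width}x1xVL (Non-packed)", False)
    if elem_bits == XLEN:
        return (f"{kind}{elem_bits}x{group_size}xVL (Non-packed)", True)
    if group_bits % XLEN == 0:
        # The footnote marker applies once a group spans several registers.
        return (f"{kind}{elem_bits}x{group_size}xVL (Packed)", regs_per_group > 1)
    return ("Not supported", False)
```

The second element of the returned tuple corresponds to the \* footnote, i.e. the equivalent is not available unless *VL* is changed.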