Switching off parts of the CPU
==============================

These notes began as the result of a discussion with Luke in April 2021.

They have been updated since and should be regarded as a record of current
thinking in a discussion, not as any sort of conclusion.

Overview
--------

The basic ideas are:

* Few programs use the Vector Registers or the Vector-Scalar Registers

* If these could be switched off, power would be saved

* They would be powered back up when needed

* Other parts of the CPU might also be switched off opportunistically

A bit more detail
-----------------

* How should parts be switched on and off? Is it done by the hardware or controlled by the operating system (OS)?

** It should be under OS control.
The OS already does this sort of thing for disks, Bluetooth, ...
The OS is in the best position to know

*** how much idle time makes a component a candidate for switching off

*** when not to switch off, e.g. when imminent use is expected

*** user preferences

** It has been suggested that a hardware (HW) switch-on would be much faster than the OS doing it

*** What would be the state of the restarted hardware? If registers are
switched on, what are their initial values? This is something that the OS might
want some say in: zero is not always the answer.

*** Maybe a status bit per hardware component that causes an OS interrupt.

* What happens when a program tries to use something that is switched off?

** A hardware exception ought to be raised, in much the same way as when
a program tries to use floating point on hardware that does not have floating point.
The OS could then switch the component on and restart the process.
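
The exception-and-restart flow in the last bullet can be sketched as a small Python model. Everything here is illustrative: `UnitPoweredOff`, `Core` and `ToyOS` are hypothetical names, not part of any existing design. Having the OS restore saved register contents (or a default it chooses) is only one possible answer to the initial-values question raised above.

```python
class UnitPoweredOff(Exception):
    """Hypothetical trap raised when an instruction needs a powered-down unit."""

    def __init__(self, unit):
        super().__init__(unit)
        self.unit = unit


class Core:
    """Toy core with independently switchable units (assumed model)."""

    def __init__(self, units):
        self.powered = {name: False for name in units}
        self.regs = {name: None for name in units}

    def execute(self, unit, op):
        # Using a powered-down unit traps instead of executing.
        if not self.powered[unit]:
            raise UnitPoweredOff(unit)
        return op(self.regs[unit])


class ToyOS:
    """Toy OS that owns the power policy and the register contents."""

    def __init__(self, core):
        self.core = core
        self.saved = {}  # register contents saved at power-off
        self.traps = 0

    def power_off(self, unit):
        self.saved[unit] = self.core.regs[unit]
        self.core.regs[unit] = None  # contents are lost with the power
        self.core.powered[unit] = False

    def power_on(self, unit, default):
        # The OS, not the hardware, decides the initial register values.
        self.core.powered[unit] = True
        self.core.regs[unit] = self.saved.pop(unit, default)

    def run(self, unit, op, default=0):
        try:
            return self.core.execute(unit, op)
        except UnitPoweredOff as trap:
            self.traps += 1
            self.power_on(trap.unit, default)
            # Restart the faulting operation now that the unit is on.
            return self.core.execute(unit, op)
```

In this sketch the first use of a unit costs one trap, and later uses run directly until the OS calls `power_off` again.
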


Questions
---------

* How long will it take to start (power up) part of the CPU?

** I have looked hard for a clue.

** The best that I can find is [ARM's big.LITTLE](https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/ten-things-to-know-about-big-little),
which talks about 30-100 microseconds to do process migration and voltage/frequency changes.

** This is NOT the same as what is being discussed here, but it is the closest figure that I can find.

** Finding how long it takes to, e.g., switch a Bluetooth device on or off might give a closer estimate of
how long things take.

** Jacob says: I'd expect the power-on latency could be on the order of a few
10s or 100s of nanoseconds, at which scale calling out to the OS greatly
increases latency.

** Jacob: I think it's faster than voltage/frequency switching because V/F
switching usually involves adjusting the power-supply voltage, which takes on
the order of microseconds, dwarfing the interrupt-to-OS latency.

* How much power would be saved?

* How should this be implemented?

* What components could we power down?

** Register files, e.g. Vector Registers

** Crypto hardware (I'm not sure if this would be worth it)

** Embedded modem, Bluetooth, radio frequency signal processing, Wi-Fi, ...

** We must recognise that new components will be wanted in future. What
we do must be able to accommodate future uses.



Comments
--------

Jacob:

To help the OS decide when to power off parts of the CPU, I think we need
32-bit saturating counters (16-bit is not enough, 64-bit is overkill;
saturating to avoid issues with wrap-around, which would happen about once a second at
4GHz) of the number of clock cycles since that part was last
used. The counter is set to 0 when the CPU part is powered back on, even if it
didn't end up being used (e.g. mis-speculation). The counters *must* be
privileged-only, since they form an excellent side-channel for speculative
execution due to mis-speculation still being a use of that hardware.
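
As a rough check of the sizing argument, the following Python sketch models one saturating idle counter and computes how long a free-running counter of a given width would take to wrap. The function names are made up for illustration, and the 4GHz clock is the figure quoted above.

```python
COUNTER_BITS = 32
COUNTER_MAX = (1 << COUNTER_BITS) - 1  # 0xFFFFFFFF


def tick(counter, in_use):
    """Advance the idle counter by one clock cycle.

    The counter is cleared whenever the part is used and otherwise
    counts up, sticking at COUNTER_MAX instead of wrapping to 0.
    """
    if in_use:
        return 0
    return min(counter + 1, COUNTER_MAX)


def seconds_to_fill(bits, clock_hz=4e9):
    """Time for a counter of the given width to count through all its values."""
    return (1 << bits) / clock_hz


if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(bits, "bits:", seconds_to_fill(bits), "seconds")
```

At 4GHz a 16-bit counter fills in roughly 16 microseconds and a 32-bit counter in a little over one second, while a 64-bit counter would take over a century, which matches the "not enough" and "overkill" remarks.
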

ADDW: There is interaction with a previously discussed idea about which
registers to (not) save/restore on a context switch. Not restoring is much the
same as saying that the registers are not used.

Jacob: We could have a simple OS-controlled compare register for each part,
where the part is powered down if the compare register is < the last-use
counter, allowing simple HW power management. If the OS wants finer control, it
can set the compare register to 0xFFFFFFFF to force the part to stay powered on, and to
something less than the current counter to power down the part.

I picked < instead of <= so that both of the following hold:

1. 0xFFFFFFFF will never power down the part, since the counter stops at 0xFFFFFFFF and
0xFFFFFFFF is not < 0xFFFFFFFF.
2. 0 will still power on the part if it's in use, since the counter is
continuously cleared to 0 while the part is in use, and it remains powered on
since 0 is not < 0.
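
The two properties of the `<` comparison can be checked with a small behavioural model. This is only a sketch: `PowerZone` and `step` are hypothetical names, and the model assumes one update per clock cycle, with the power state following the comparison immediately.

```python
COUNTER_MAX = 0xFFFFFFFF


class PowerZone:
    """One power zone: OS-written compare register plus HW idle counter."""

    def __init__(self, compare=10000):
        self.compare = compare  # written by the OS
        self.counter = 0        # cycles since the part was last used
        self.powered = True

    def step(self, in_use):
        """Model one clock cycle and return whether the part is powered."""
        if in_use:
            self.counter = 0            # continuously cleared while in use
        elif self.counter < COUNTER_MAX:
            self.counter += 1           # saturates at COUNTER_MAX
        # Powered down exactly when compare < counter (strict comparison).
        self.powered = not (self.compare < self.counter)
        return self.powered
```

With `compare` set to `0xFFFFFFFF` the part stays on however long it idles. With `compare` set to 0 the part stays on while in use and is powered down after the first idle cycle.
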

It might be handy to have a separate register that the previous count is copied to
when a part is powered on. This would let the OS detect edge cases such as the part
being used shortly after power-off, so that it can adjust the power-off
interval to better suit the program's usage patterns.
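
One way the previous-count register might be used is sketched below. The latch behaviour follows the paragraph above. The tuning policy in `tune_compare` (double the threshold if the part was needed again within `margin` times the threshold) is purely an assumption for illustration, as are all the names.

```python
COUNTER_MAX = 0xFFFFFFFF


class PowerZone:
    """Power zone with an extra register latching the idle time at power-on."""

    def __init__(self, compare):
        self.compare = compare
        self.counter = 0
        self.prev_count = 0  # idle cycles recorded at the last power-on
        self.powered = True

    def step(self, in_use):
        if in_use:
            if not self.powered:
                # Part is being woken up: remember how long it was idle.
                self.prev_count = self.counter
            self.counter = 0
        elif self.counter < COUNTER_MAX:
            self.counter += 1
        self.powered = not (self.compare < self.counter)


def tune_compare(zone, margin=2):
    """Hypothetical OS policy run after a power-on.

    If the part was needed again soon after it was powered down,
    the power-off was premature, so the threshold is doubled.
    """
    if zone.compare < zone.prev_count <= zone.compare * margin:
        zone.compare = min(zone.compare * 2, COUNTER_MAX)
    return zone.compare
```

For example, with a threshold of 100 cycles and a part that is reused after 120 idle cycles, this policy would raise the threshold to 200.
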

There would be one set of those 32-bit registers (maybe combined into 64-bit
registers) for each independent power zone on the CPU core.

The compare field should be set to some reasonable default on core reset; I'd
use 10000 as a reasonable first guess.
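
If the 32-bit fields were combined into one 64-bit register per power zone, the packing might look like the sketch below. The field order (compare in the upper half, counter in the lower half) is an arbitrary assumption, as are the helper names. Only the default of 10000 comes from the text above.

```python
MASK32 = 0xFFFFFFFF
DEFAULT_COMPARE = 10000  # suggested first guess for the reset value


def pack(compare, counter):
    """Combine the two 32-bit fields into one 64-bit word (assumed layout)."""
    return ((compare & MASK32) << 32) | (counter & MASK32)


def unpack(word):
    """Split a 64-bit word back into (compare, counter)."""
    return (word >> 32) & MASK32, word & MASK32


def reset_value():
    """Register contents after core reset: default compare, counter at 0."""
    return pack(DEFAULT_COMPARE, 0)
```
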


ADDW