move singlepipe, multipipe, nmoperator and pipeline.py to nmutil
1 """ Pipeline API. For multi-input and multi-output variants, see multipipe.
2
3 Associated development bugs:
4 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
5 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
6
7 Important: see Stage API (stageapi.py) in combination with below
8
9 RecordBasedStage:
10 ----------------
11
12 A convenience class that takes an input shape, output shape, a
13 "processing" function and an optional "setup" function. Honestly
14 though, there's not much more effort to just... create a class
15 that returns a couple of Records (see ExampleAddRecordStage in
16 examples).
17
18 PassThroughStage:
19 ----------------
20
21 A convenience class that takes a single function as a parameter,
22 that is chain-called to create the exact same input and output spec.
23 It has a process() function that simply returns its input.
24
25 Instances of this class are completely redundant if handed to
26 StageChain, however when passed to UnbufferedPipeline they
27 can be used to introduce a single clock delay.
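
As a rough illustration of the redundancy point above, here is a sketch
that mimics the Stage API in plain Python (no nmigen). PassThroughSketch,
AddOneSketch and chain_process are hypothetical stand-ins, not names from
this module, and chain_process only imitates the combinatorial chaining
that StageChain is described as doing.

```python
# Plain-Python stand-ins for the Stage API (ispec / ospec / process).
# All names here are hypothetical; real stages return nmigen objects.

class PassThroughSketch:
    def __init__(self, iospecfn):
        self.iospecfn = iospecfn

    def ispec(self):
        return self.iospecfn()

    def ospec(self):
        return self.iospecfn()

    def process(self, i):
        # a pass-through stage simply returns its input
        return i


class AddOneSketch:
    def ispec(self):
        return 0

    def ospec(self):
        return 0

    def process(self, i):
        return i + 1


def chain_process(stages, i):
    # imitate combinatorial chaining: feed each process() into the next
    for stage in stages:
        i = stage.process(i)
    return i


with_pass = chain_process([PassThroughSketch(int), AddOneSketch()], 41)
without_pass = chain_process([AddOneSketch()], 41)
print(with_pass, without_pass)
```

Both chains compute the same value, which is why a pass-through stage only
becomes useful once a clocked pipeline class wraps it.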
28
29 ControlBase:
30 -----------
31
32 The base class for pipelines. Contains previous and next ready/valid/data.
33 Also has an extremely useful "connect" function that can be used to
34 connect a chain of pipelines and present the exact same prev/next
35 ready/valid/data API.
36
37 Note: pipelines basically do not become pipelines as such until
38 handed to a derivative of ControlBase. ControlBase itself is *not*
39 strictly considered a pipeline class. Wishbone and AXI4 (master or
40 slave) could be derived from ControlBase, for example.
41 UnbufferedPipeline:
42 ------------------
43
44 A simple stalling clock-synchronised pipeline that has no buffering
45 (unlike BufferedHandshake). Data flows on *every* clock cycle when
46 the conditions are right (this is nominally when the input is valid
47 and the output is ready).
48
49 A stall anywhere along the line will result in a stall back-propagating
50 down the entire chain. The BufferedHandshake by contrast will buffer
51 incoming data, allowing previous stages one clock cycle's grace before
52 also having to stall.
53
54 An advantage of the UnbufferedPipeline over the Buffered one is
55 that the amount of logic needed (number of gates) is greatly
56 reduced (no second set of buffers basically)
57
58 The disadvantage of the UnbufferedPipeline is that the valid/ready
59 logic, if chained together, is *combinatorial*, resulting in
60 progressively larger gate delay.
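
The stalling behaviour described above can be sketched as a small
cycle-by-cycle model in plain Python. This is an approximation written
from the equations in UnbufferedPipeline.elaborate, not nmigen code:
UnbufferedModel and run are hypothetical names, and the source/sink
behaviour in run (hold each item until accepted) is an assumption.

```python
# Behavioural model of one unbuffered valid/ready stage, in plain Python.
class UnbufferedModel:
    def __init__(self, process):
        self.process = process
        self.data_valid = False   # registered: does r_data hold valid data?
        self.r_data = None        # registered: processed data

    def step(self, p_valid_i, p_data_i, n_ready_i):
        # combinatorial outputs, as seen during this clock cycle
        n_valid_o = self.data_valid
        n_data_o = self.r_data
        p_ready_o = (not self.data_valid) or n_ready_i
        buf_full = (not n_ready_i) and self.data_valid
        # clock edge: update the registers
        if p_valid_i and p_ready_o:
            self.r_data = self.process(p_data_i)
        self.data_valid = p_valid_i or buf_full
        return p_ready_o, n_valid_o, n_data_o


def run(model, items, ready_pattern):
    # assumed test harness: the source holds each item until it is accepted
    pending = list(items)
    received, accepted_at = [], []
    for cycle, n_ready in enumerate(ready_pattern):
        p_valid = bool(pending)
        p_data = pending[0] if pending else None
        p_ready, n_valid, n_data = model.step(p_valid, p_data, n_ready)
        if p_valid and p_ready:
            pending.pop(0)
            accepted_at.append(cycle)
        if n_valid and n_ready:
            received.append(n_data)
    return received, accepted_at


stall = [True, False, False, True, True, True]
received, accepted_at = run(UnbufferedModel(lambda x: x + 1), [1, 2, 3], stall)
print(received, accepted_at)
```

With the output stalled on cycles 1 and 2, the second item is not accepted
until cycle 3: the stall reaches the input in the same cycle, because
p_ready_o depends combinatorially on n_ready_i.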
61
62 PassThroughHandshake:
63 ------------------
64
65 A Control class that introduces a single clock delay, passing its
66 data through unaltered. Unlike RegisterPipeline (which relies
67 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
68 itself.
69
70 RegisterPipeline:
71 ----------------
72
73 A convenience class: because UnbufferedPipeline introduces a single
74 clock delay, giving it a PassThroughStage as its stage results in a Pipeline
75 stage that, duh, delays its (unmodified) input by one clock cycle.
76
77 BufferedHandshake:
78 ----------------
79
80 nmigen implementation of buffered pipeline stage, based on zipcpu:
81 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
82
83 this module requires quite a bit of thought to understand how it works
84 (and why it is needed in the first place). reading the above is
85 *strongly* recommended.
86
87 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
88 the STB / ACK signals to raise and lower (on separate clocks) before
89 data may proceed (thus only allowing one piece of data to proceed
90 on *ALTERNATE* cycles), the signalling here is a true pipeline
91 where data will flow on *every* clock when the conditions are right.
92
93 input acceptance conditions are when:
94 * incoming previous-stage strobe (p.valid_i) is HIGH
95 * outgoing previous-stage ready (p.ready_o) is HIGH
96
97 output transmission conditions are when:
98 * outgoing next-stage strobe (n.valid_o) is HIGH
99 * incoming next-stage ready (n.ready_i) is HIGH
100
101 the tricky bit is when the input has valid data and the output is not
102 ready to accept it. if it wasn't for the clock synchronisation, it
103 would be possible to tell the input "hey don't send that data, we're
104 not ready". unfortunately, it's not possible to "change the past":
105 the previous stage *has no choice* but to pass on its data.
106
107 therefore, the incoming data *must* be accepted - and stored: that
108 is the responsibility / contract that this stage *must* accept.
109 on the same clock, it's possible to tell the input that it must
110 not send any more data. this is the "stall" condition.
111
112 we now effectively have *two* possible pieces of data to "choose" from:
113 the buffered data, and the incoming data. the decision as to which
114 to process and output is based on whether we are in "stall" or not.
115 i.e. when the next stage is ready again, the output comes from
116 the buffer if a stall had previously occurred, otherwise it comes
117 direct from processing the input.
118
119 this allows us to respect a synchronous "travelling STB" with what
120 dan calls a "buffered handshake".
121
122 it's quite a complex state machine!
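
To make the state machine a little easier to follow, here is a plain-Python
sketch that mirrors the combinatorial temporaries and registered updates
of BufferedHandshake.elaborate. BufferedModel and run are hypothetical
names. One loudly-flagged assumption: the model starts with p_ready_o
HIGH (buffer empty), which may not match the hardware reset value.

```python
# Behavioural sketch of the buffered handshake, in plain Python.
class BufferedModel:
    def __init__(self, process):
        self.process = process
        self.p_ready_o = True    # ASSUMPTION: start with an empty buffer
        self.n_valid_o = False
        self.n_data_o = None
        self.r_data = None       # the extra (stall) buffer

    def step(self, p_valid_i, p_data_i, n_ready_i):
        # all outputs are registers: report their current values
        outputs = (self.p_ready_o, self.n_valid_o, self.n_data_o)
        result = self.process(p_data_i) if p_valid_i else None
        por, nov = self.p_ready_o, self.n_valid_o
        nir_por = n_ready_i and por
        nir_por_n = n_ready_i and not por
        nir_novn = n_ready_i or not nov
        nirn_novn = (not n_ready_i) and not nov
        npnn = nir_por or nirn_novn
        por_pivn = por and not p_valid_i
        # clock edge: registered updates
        if por:                              # not stalled: update buffer
            self.r_data = result
        if npnn:                             # data pass-through
            self.n_valid_o, self.n_data_o = p_valid_i, result
        if nir_por_n:                        # flush buffer to output
            self.n_valid_o, self.n_data_o = True, self.r_data
        self.p_ready_o = nir_novn or por_pivn
        return outputs


def run(model, items, ready_pattern):
    # assumed test harness: the source holds each item until it is accepted
    pending = list(items)
    received, accepted_at = [], []
    for cycle, n_ready in enumerate(ready_pattern):
        p_valid = bool(pending)
        p_data = pending[0] if pending else None
        p_ready, n_valid, n_data = model.step(p_valid, p_data, n_ready)
        if p_valid and p_ready:
            pending.pop(0)
            accepted_at.append(cycle)
        if n_valid and n_ready:
            received.append(n_data)
    return received, accepted_at


stall = [True, False, False, True, True, True]
received, accepted_at = run(BufferedModel(lambda x: x + 1), [1, 2, 3], stall)
print(received, accepted_at)
```

The second item is still accepted on cycle 1, even though the output is
already stalled: that is the one clock cycle of grace the buffer buys.
No data is lost or reordered.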
123
124 SimpleHandshake:
125 ---------------
126
127 Synchronised pipeline, based on:
128 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
129 """
130
131 from nmigen import Signal, Mux, Module, Elaboratable
132 from nmigen.cli import verilog, rtlil
133 from nmigen.hdl.rec import Record
134
135 from queue import Queue # NOTE: expects a project-local nmigen FIFO module, not the stdlib queue
136 import inspect
137
138 from iocontrol import (PrevControl, NextControl, Object, RecordObject)
139 from stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
140 import nmoperator
141
142
143 class RecordBasedStage(Stage):
144 """ convenience class which provides a Records-based layout.
145 honestly it's a lot easier just to create a direct Records-based
146 class (see ExampleAddRecordStage)
147 """
148 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
149 self.in_shape = in_shape
150 self.out_shape = out_shape
151 self.__process = processfn
152 self.__setup = setupfn
153 def ispec(self): return Record(self.in_shape)
154 def ospec(self): return Record(self.out_shape)
155 def process(self, i): return self.__process(i)
156 def setup(self, m, i): return self.__setup(m, i)
157
158
159 class PassThroughStage(StageCls):
160 """ a pass-through stage with its input data spec identical to its output,
161 and "passes through" its data from input to output (does nothing).
162
163 use this basically to explicitly make any data spec Stage-compliant.
164 (many APIs would potentially use a static "wrap" method in e.g.
165 StageCls to achieve a similar effect)
166 """
167 def __init__(self, iospecfn): self.iospecfn = iospecfn
168 def ispec(self): return self.iospecfn()
169 def ospec(self): return self.iospecfn()
170
171
172 class ControlBase(StageHelper, Elaboratable):
173 """ Common functions for Pipeline API. Note: a "pipeline stage" only
174 exists (conceptually) when a ControlBase derivative is handed
175 a Stage (combinatorial block)
176
177 NOTE: ControlBase derives from StageHelper, making it accidentally
178 compliant with the Stage API. Using those functions directly
179 *BYPASSES* a ControlBase instance ready/valid signalling, which
180 clearly should not be done without a really, really good reason.
181 """
182 def __init__(self, stage=None, in_multi=None, stage_ctl=False):
183 """ Base class containing ready/valid/data to previous and next stages
184
185 * p: contains ready/valid to the previous stage
186 * n: contains ready/valid to the next stage
187
188 Except when calling ControlBase.connect(), user must also:
189 * add data_i member to PrevControl (p) and
190 * add data_o member to NextControl (n)
191 Calling ControlBase._new_data is a good way to do that.
192 """
193 StageHelper.__init__(self, stage)
194
195 # set up input and output IO ACK (prev/next ready/valid)
196 self.p = PrevControl(in_multi, stage_ctl)
197 self.n = NextControl(stage_ctl)
198
199 # set up the input and output data
200 if stage is not None:
201 self._new_data("data")
202
203 def _new_data(self, name):
204 """ allocates new data_i and data_o
205 """
206 self.p.data_i, self.n.data_o = self.new_specs(name)
207
208 @property
209 def data_r(self):
210 return self.process(self.p.data_i)
211
212 def connect_to_next(self, nxt):
213 """ helper function to connect to the next stage data/valid/ready.
214 """
215 return self.n.connect_to_next(nxt.p)
216
217 def _connect_in(self, prev):
218 """ internal helper function to connect stage to an input source.
219 do not use to connect stage-to-stage!
220 """
221 return self.p._connect_in(prev.p)
222
223 def _connect_out(self, nxt):
224 """ internal helper function to connect stage to an output source.
225 do not use to connect stage-to-stage!
226 """
227 return self.n._connect_out(nxt.n)
228
229 def connect(self, pipechain):
230 """ connects a chain (list) of Pipeline instances together and
231 links them to this ControlBase instance:
232
233 in <----> self <---> out
234 | ^
235 v |
236 [pipe1, pipe2, pipe3, pipe4]
237 | ^ | ^ | ^
238 v | v | v |
239 out---in out--in out---in
240
241 Also takes care of allocating data_i/data_o, by looking up
242 the data spec for each end of the pipechain. i.e. it is NOT
243 necessary to allocate self.p.data_i or self.n.data_o manually:
244 this is handled AUTOMATICALLY, here.
245
246 Basically this function is the direct equivalent of StageChain,
247 except that unlike StageChain, the Pipeline logic is followed.
248
249 Just as StageChain presents an object that conforms to the
250 Stage API from a list of objects that also conform to the
251 Stage API, an object that calls this Pipeline connect function
252 has the exact same pipeline API as the list of pipeline objects
253 it is called with.
254
255 Thus it becomes possible to build up larger chains recursively.
256 More complex chains (multi-input, multi-output) will have to be
257 done manually.
258
259 Argument:
260
261 * :pipechain: - a sequence of ControlBase-derived classes
262 (must be one or more in length)
263
264 Returns:
265
266 * a list of eq assignments that will need to be added in
267 an elaborate() to m.d.comb
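
The wiring order described above can be sketched in plain Python. This
only lists which port gets joined to which, using strings in place of
real ControlBase instances: connect_sketch is a hypothetical helper,
and the real function returns nmigen eq assignments instead.

```python
def connect_sketch(owner, pipechain):
    # same precondition as connect(): at least one pipe in the chain
    assert len(pipechain) > 0, "pipechain must be non-zero length"
    links = []
    # inter-chain: each earlier pipe's n goes to the later pipe's p
    for earlier, later in zip(pipechain, pipechain[1:]):
        links.append((earlier + ".n", later + ".p"))
    # front and back of the chain are joined to the owner's own p and n
    links.append((owner + ".p", pipechain[0] + ".p"))
    links.append((pipechain[-1] + ".n", owner + ".n"))
    return links


links = connect_sketch("self", ["pipe1", "pipe2", "pipe3", "pipe4"])
for src, dst in links:
    print(src, "->", dst)
```

A chain of N pipes therefore needs N-1 inter-chain links plus two links
to the enclosing object, which then looks like a single pipe itself.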
268 """
269 assert len(pipechain) > 0, "pipechain must be non-zero length"
270 assert self.stage is None, "do not use connect with a stage"
271 eqs = [] # collated list of assignment statements
272
273 # connect inter-chain
274 for i in range(len(pipechain)-1):
275 pipe1 = pipechain[i] # earlier
276 pipe2 = pipechain[i+1] # later (by 1)
277 eqs += pipe1.connect_to_next(pipe2) # earlier n to later p
278
279 # connect front and back of chain to ourselves
280 front = pipechain[0] # first in chain
281 end = pipechain[-1] # last in chain
282 self.set_specs(front, end) # sets up ispec/ospec functions
283 self._new_data("chain") # NOTE: REPLACES existing data
284 eqs += front._connect_in(self) # front p to our p
285 eqs += end._connect_out(self) # end n to our n
286
287 return eqs
288
289 def set_input(self, i):
290 """ helper function to set the input data (used in unit tests)
291 """
292 return nmoperator.eq(self.p.data_i, i)
293
294 def __iter__(self):
295 yield from self.p # yields ready/valid/data (data also gets yielded)
296 yield from self.n # ditto
297
298 def ports(self):
299 return list(self)
300
301 def elaborate(self, platform):
302 """ handles case where stage has dynamic ready/valid functions
303 """
304 m = Module()
305 m.submodules.p = self.p
306 m.submodules.n = self.n
307
308 self.setup(m, self.p.data_i)
309
310 if not self.p.stage_ctl:
311 return m
312
313 # intercept the previous (outgoing) "ready", combine with stage ready
314 m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)
315
316 # intercept the next (incoming) "ready" and combine it with data valid
317 sdv = self.stage.d_valid(self.n.ready_i)
318 m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)
319
320 return m
321
322
323 class BufferedHandshake(ControlBase):
324 """ buffered pipeline stage. data and strobe signals travel in sync.
325 if ever the input is ready and the output is not, processed data
326 is shunted into a temporary register.
327
328 Argument: stage. see Stage API above
329
330 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
331 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
332 stage-1 p.data_i >>in stage n.data_o out>> stage+1
333 | |
334 process --->----^
335 | |
336 +-- r_data ->-+
337
338 input data p.data_i is read (only), is processed and goes into an
339 intermediate result store [process()]. this is updated combinatorially.
340
341 in a non-stall condition, the intermediate result will go into the
342 output (update_output). however if ever there is a stall, it goes
343 into r_data instead [update_buffer()].
344
345 when the non-stall condition is released, r_data is the first
346 to be transferred to the output [flush_buffer()], and the stall
347 condition cleared.
348
349 on the next cycle (as long as stall is not raised again) the
350 input may begin to be processed and transferred directly to output.
351 """
352
353 def elaborate(self, platform):
354 self.m = ControlBase.elaborate(self, platform)
355
356 result = _spec(self.stage.ospec, "r_tmp")
357 r_data = _spec(self.stage.ospec, "r_data")
358
359 # establish some combinatorial temporaries
360 o_n_validn = Signal(reset_less=True)
361 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
362 nir_por = Signal(reset_less=True)
363 nir_por_n = Signal(reset_less=True)
364 p_valid_i = Signal(reset_less=True)
365 nir_novn = Signal(reset_less=True)
366 nirn_novn = Signal(reset_less=True)
367 por_pivn = Signal(reset_less=True)
368 npnn = Signal(reset_less=True)
369 self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
370 o_n_validn.eq(~self.n.valid_o),
371 n_ready_i.eq(self.n.ready_i_test),
372 nir_por.eq(n_ready_i & self.p._ready_o),
373 nir_por_n.eq(n_ready_i & ~self.p._ready_o),
374 nir_novn.eq(n_ready_i | o_n_validn),
375 nirn_novn.eq(~n_ready_i & o_n_validn),
376 npnn.eq(nir_por | nirn_novn),
377 por_pivn.eq(self.p._ready_o & ~p_valid_i)
378 ]
379
380 # store result of processing in combinatorial temporary
381 self.m.d.comb += nmoperator.eq(result, self.data_r)
382
383 # if not in stall condition, update the temporary register
384 with self.m.If(self.p.ready_o): # not stalled
385 self.m.d.sync += nmoperator.eq(r_data, result) # update buffer
386
387 # data pass-through conditions
388 with self.m.If(npnn):
389 data_o = self._postprocess(result) # XXX TBD, does nothing right now
390 self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
391 nmoperator.eq(self.n.data_o, data_o), # update out
392 ]
393 # buffer flush conditions (NOTE: can override data passthru conditions)
394 with self.m.If(nir_por_n): # not stalled
395 # Flush the [already processed] buffer to the output port.
396 data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
397 self.m.d.sync += [self.n.valid_o.eq(1), # reg empty
398 nmoperator.eq(self.n.data_o, data_o), # flush
399 ]
400 # output ready conditions
401 self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)
402
403 return self.m
404
405
406 class SimpleHandshake(ControlBase):
407 """ simple handshake control. data and strobe signals travel in sync.
408 implements the protocol used by Wishbone and AXI4.
409
410 Argument: stage. see Stage API above
411
412 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
413 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
414 stage-1 p.data_i >>in stage n.data_o out>> stage+1
415 | |
416 +--process->--^
417 Truth Table
418
419 Inputs Temporary Output Data
420 ------- ---------- ----- ----
421 P P N N PiV& ~NiR& N P
422 i o i o PoR NoV o o
423 V R R V V R
424
425 ------- - - - -
426 0 0 0 0 0 0 >0 0 reg
427 0 0 0 1 0 1 >1 0 reg
428 0 0 1 0 0 0 0 1 process(data_i)
429 0 0 1 1 0 0 0 1 process(data_i)
430 ------- - - - -
431 0 1 0 0 0 0 >0 0 reg
432 0 1 0 1 0 1 >1 0 reg
433 0 1 1 0 0 0 0 1 process(data_i)
434 0 1 1 1 0 0 0 1 process(data_i)
435 ------- - - - -
436 1 0 0 0 0 0 >0 0 reg
437 1 0 0 1 0 1 >1 0 reg
438 1 0 1 0 0 0 0 1 process(data_i)
439 1 0 1 1 0 0 0 1 process(data_i)
440 ------- - - - -
441 1 1 0 0 1 0 1 0 process(data_i)
442 1 1 0 1 1 1 1 0 process(data_i)
443 1 1 1 0 1 0 1 1 process(data_i)
444 1 1 1 1 1 0 1 1 process(data_i)
445 ------- - - - -
446 """
447
448 def elaborate(self, platform):
449 self.m = m = ControlBase.elaborate(self, platform)
450
451 r_busy = Signal()
452 result = _spec(self.stage.ospec, "r_tmp")
453
454 # establish some combinatorial temporaries
455 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
456 p_valid_i_p_ready_o = Signal(reset_less=True)
457 p_valid_i = Signal(reset_less=True)
458 m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
459 n_ready_i.eq(self.n.ready_i_test),
460 p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
461 ]
462
463 # store result of processing in combinatorial temporary
464 m.d.comb += nmoperator.eq(result, self.data_r)
465
466 # previous valid and ready
467 with m.If(p_valid_i_p_ready_o):
468 data_o = self._postprocess(result) # XXX TBD, does nothing right now
469 m.d.sync += [r_busy.eq(1), # output valid
470 nmoperator.eq(self.n.data_o, data_o), # update output
471 ]
472 # previous invalid or not ready, however next is accepting
473 with m.Elif(n_ready_i):
474 data_o = self._postprocess(result) # XXX TBD, does nothing right now
475 m.d.sync += [nmoperator.eq(self.n.data_o, data_o)]
476 # TODO: could still send data here (if there was any)
477 #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
478 m.d.sync += r_busy.eq(0) # ...so set output invalid
479
480 m.d.comb += self.n.valid_o.eq(r_busy)
481 # if next is ready, so is previous
482 m.d.comb += self.p._ready_o.eq(n_ready_i)
483
484 return self.m
485
486
487 class UnbufferedPipeline(ControlBase):
488 """ A simple pipeline stage with single-clock synchronisation
489 and two-way valid/ready synchronised signalling.
490
491 Note that a stall in one stage will result in the entire pipeline
492 chain stalling.
493
494 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
495 travel synchronously with the data: the valid/ready signalling
496 combines in a *combinatorial* fashion. Therefore, a long pipeline
497 chain will lengthen propagation delays.
498
499 Argument: stage. see Stage API, above
500
501 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
502 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
503 stage-1 p.data_i >>in stage n.data_o out>> stage+1
504 | |
505 r_data result
506 | |
507 +--process ->-+
508
509 Attributes:
510 -----------
511 p.data_i : StageInput, shaped according to ispec
512 The pipeline input
513 n.data_o : StageOutput, shaped according to ospec
514 The pipeline output
515 r_data : input_shape according to ispec
516 A temporary (buffered) copy of a prior (valid) input.
517 This is HELD if the output is not ready. It is updated
518 SYNCHRONOUSLY.
519 result: output_shape according to ospec
520 The output of the combinatorial logic. it is updated
521 COMBINATORIALLY (no clock dependence).
522
523 Truth Table
524
525 Inputs Temp Output Data
526 ------- - ----- ----
527 P P N N ~NiR& N P
528 i o i o NoV o o
529 V R R V V R
530
531 ------- - - -
532 0 0 0 0 0 0 1 reg
533 0 0 0 1 1 1 0 reg
534 0 0 1 0 0 0 1 reg
535 0 0 1 1 0 0 1 reg
536 ------- - - -
537 0 1 0 0 0 0 1 reg
538 0 1 0 1 1 1 0 reg
539 0 1 1 0 0 0 1 reg
540 0 1 1 1 0 0 1 reg
541 ------- - - -
542 1 0 0 0 0 1 1 reg
543 1 0 0 1 1 1 0 reg
544 1 0 1 0 0 1 1 reg
545 1 0 1 1 0 1 1 reg
546 ------- - - -
547 1 1 0 0 0 1 1 process(data_i)
548 1 1 0 1 1 1 0 process(data_i)
549 1 1 1 0 0 1 1 process(data_i)
550 1 1 1 1 0 1 1 process(data_i)
551 ------- - - -
552
553 Note: PoR is *NOT* involved in the above decision-making.
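
The control columns of the table above can be regenerated with a few
lines of plain Python. This is a sketch based on the equations in
elaborate (buf_full, data_valid, p._ready_o); unbuffered_row is a
hypothetical helper and the Data column is not modelled.

```python
from itertools import product


def unbuffered_row(piv, por, nir, nov):
    # por is accepted but deliberately unused, matching the note above
    temp = (not nir) and nov      # ~NiR & NoV (buf_full)
    nov_next = piv or temp        # next value of the valid register
    por_out = not temp            # ready back to the previous stage
    return int(temp), int(nov_next), int(por_out)


rows = {bits: unbuffered_row(*bits) for bits in product((0, 1), repeat=4)}
for bits, outs in rows.items():
    print(*bits, "|", *outs)

# check the note: flipping the PoR input never changes the outputs
por_independent = all(rows[(p, 0, n, v)] == rows[(p, 1, n, v)]
                      for p, n, v in product((0, 1), repeat=3))
print(por_independent)
```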
554 """
555
556 def elaborate(self, platform):
557 self.m = m = ControlBase.elaborate(self, platform)
558
559 data_valid = Signal() # is data valid or not
560 r_data = _spec(self.stage.ospec, "r_tmp") # output type
561
562 # some temporaries
563 p_valid_i = Signal(reset_less=True)
564 pv = Signal(reset_less=True)
565 buf_full = Signal(reset_less=True)
566 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
567 m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
568 m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)
569
570 m.d.comb += self.n.valid_o.eq(data_valid)
571 m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
572 m.d.sync += data_valid.eq(p_valid_i | buf_full)
573
574 with m.If(pv):
575 m.d.sync += nmoperator.eq(r_data, self.data_r)
576 data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
577 m.d.comb += nmoperator.eq(self.n.data_o, data_o)
578
579 return self.m
580
581 class UnbufferedPipeline2(ControlBase):
582 """ A simple pipeline stage with single-clock synchronisation
583 and two-way valid/ready synchronised signalling.
584
585 Note that a stall in one stage will result in the entire pipeline
586 chain stalling.
587
588 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
589 travel synchronously with the data: the valid/ready signalling
590 combines in a *combinatorial* fashion. Therefore, a long pipeline
591 chain will lengthen propagation delays.
592
593 Argument: stage. see Stage API, above
594
595 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
596 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
597 stage-1 p.data_i >>in stage n.data_o out>> stage+1
598 | | |
599 +- process-> buf <-+
600 Attributes:
601 -----------
602 p.data_i : StageInput, shaped according to ispec
603 The pipeline input
604 n.data_o : StageOutput, shaped according to ospec
605 The pipeline output
606 buf : output_shape according to ospec
607 A temporary (buffered) copy of a valid output
608 This is HELD if the output is not ready. It is updated
609 SYNCHRONOUSLY.
610
611 Inputs Temp Output Data
612 ------- - -----
613 P P N N ~NiR& N P (buf_full)
614 i o i o NoV o o
615 V R R V V R
616
617 ------- - - -
618 0 0 0 0 0 0 1 process(data_i)
619 0 0 0 1 1 1 0 reg (odata, unchanged)
620 0 0 1 0 0 0 1 process(data_i)
621 0 0 1 1 0 0 1 process(data_i)
622 ------- - - -
623 0 1 0 0 0 0 1 process(data_i)
624 0 1 0 1 1 1 0 reg (odata, unchanged)
625 0 1 1 0 0 0 1 process(data_i)
626 0 1 1 1 0 0 1 process(data_i)
627 ------- - - -
628 1 0 0 0 0 1 1 process(data_i)
629 1 0 0 1 1 1 0 reg (odata, unchanged)
630 1 0 1 0 0 1 1 process(data_i)
631 1 0 1 1 0 1 1 process(data_i)
632 ------- - - -
633 1 1 0 0 0 1 1 process(data_i)
634 1 1 0 1 1 1 0 reg (odata, unchanged)
635 1 1 1 0 0 1 1 process(data_i)
636 1 1 1 1 0 1 1 process(data_i)
637 ------- - - -
638
639 Note: PoR is *NOT* involved in the above decision-making.
640 """
641
642 def elaborate(self, platform):
643 self.m = m = ControlBase.elaborate(self, platform)
644
645 buf_full = Signal() # is data valid or not
646 buf = _spec(self.stage.ospec, "r_tmp") # output type
647
648 # some temporaries
649 p_valid_i = Signal(reset_less=True)
650 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
651
652 m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
653 m.d.comb += self.p._ready_o.eq(~buf_full)
654 m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)
655
656 data_o = Mux(buf_full, buf, self.data_r)
657 data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
658 m.d.comb += nmoperator.eq(self.n.data_o, data_o)
659 m.d.sync += nmoperator.eq(buf, self.n.data_o)
660
661 return self.m
662
663
664 class PassThroughHandshake(ControlBase):
665 """ A control block that delays by one clock cycle.
666
667 Inputs Temporary Output Data
668 ------- ------------------ ----- ----
669 P P N N PiV& PiV| NiR| pvr N P (pvr)
670 i o i o PoR ~PoR ~NoV o o
671 V R R V V R
672
673 ------- - - - - - -
674 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
675 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
676 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
677 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
678 ------- - - - - - -
679 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
680 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
681 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
682 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
683 ------- - - - - - -
684 1 0 0 0 0 1 1 1 1 1 process(in)
685 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
686 1 0 1 0 0 1 1 1 1 1 process(in)
687 1 0 1 1 0 1 1 1 1 1 process(in)
688 ------- - - - - - -
689 1 1 0 0 1 1 1 1 1 1 process(in)
690 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
691 1 1 1 0 1 1 1 1 1 1 process(in)
692 1 1 1 1 1 1 1 1 1 1 process(in)
693 ------- - - - - - -
694
695 """
696
697 def elaborate(self, platform):
698 self.m = m = ControlBase.elaborate(self, platform)
699
700 r_data = _spec(self.stage.ospec, "r_tmp") # output type
701
702 # temporaries
703 p_valid_i = Signal(reset_less=True)
704 pvr = Signal(reset_less=True)
705 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
706 m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)
707
708 m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
709 m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)
710
711 odata = Mux(pvr, self.data_r, r_data)
712 m.d.sync += nmoperator.eq(r_data, odata)
713 r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
714 m.d.comb += nmoperator.eq(self.n.data_o, r_data)
715
716 return m
717
718
719 class RegisterPipeline(UnbufferedPipeline):
720 """ A pipeline stage that delays by one clock cycle, creating a
721 sync'd latch out of data_o and valid_o as an indirect byproduct
722 of using PassThroughStage
723 """
724 def __init__(self, iospecfn):
725 UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))
726
727
728 class FIFOControl(ControlBase):
729 """ FIFO Control. Uses Queue to store data, coincidentally
730 happens to have the same valid/ready signalling as the Stage API.
731
732 data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
733 """
734 def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
735 fwft=True, pipe=False):
736 """ FIFO Control
737
738 * :depth: number of entries in the FIFO
739 * :stage: data processing block
740 * :fwft: first word fall-thru mode (non-fwft introduces delay)
741 * :pipe: specifies pipe mode.
742
743 when fwft = True it indicates that transfers may occur
744 combinatorially through stage processing in the same clock cycle.
745 This requires that the Stage be a Mealy FSM:
746 https://en.wikipedia.org/wiki/Mealy_machine
747
748 when fwft = False it indicates that all output signals are
749 produced only from internal registers or memory, i.e. that the
750 Stage is a Moore FSM:
751 https://en.wikipedia.org/wiki/Moore_machine
752
753 data is processed (and located) as follows:
754
755 self.p self.stage temp fn temp fn temp fp self.n
756 data_i->process()->result->cat->din.FIFO.dout->cat(data_o)
757
758 yes, really: cat produces a Cat() which can be assigned to.
759 this is how the FIFO gets de-catted without needing a de-cat
760 function
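
For readers unfamiliar with the cat trick, here is a plain-Python sketch
of the idea using integers in place of nmigen signals. cat and decat are
hypothetical helpers: in nmigen no explicit decat is needed, because
assigning to a Cat() splits the bits back out. As with nmigen's Cat,
the first field lands in the least significant bits.

```python
def cat(values, widths):
    # pack fields into one word, first field in the lowest bits
    word = 0
    shift = 0
    for value, width in zip(values, widths):
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word


def decat(word, widths):
    # split the word back into fields of the given widths
    values = []
    for width in widths:
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values


widths = [8, 4, 1]                 # example field widths (made up)
word = cat([0xAB, 0x5, 1], widths)
fields = decat(word, widths)
print(hex(word), fields)
```

The FIFO only ever sees the single packed word, so its width is simply
the sum of the field widths (13 bits in this made-up example).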
761 """
762 self.fwft = fwft
763 self.pipe = pipe
764 self.fdepth = depth
765 ControlBase.__init__(self, stage, in_multi, stage_ctl)
766
767 def elaborate(self, platform):
768 self.m = m = ControlBase.elaborate(self, platform)
769
770 # make a FIFO with a signal of equal width to the data_o.
771 (fwidth, _) = nmoperator.shape(self.n.data_o)
772 fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
773 m.submodules.fifo = fifo
774
775 def processfn(data_i):
776 # store result of processing in combinatorial temporary
777 result = _spec(self.stage.ospec, "r_temp")
778 m.d.comb += nmoperator.eq(result, self.process(data_i))
779 return nmoperator.cat(result)
780
781 ## prev: make the FIFO (Queue object) "look" like a PrevControl...
782 m.submodules.fp = fp = PrevControl()
783 fp.valid_i, fp._ready_o, fp.data_i = fifo.we, fifo.writable, fifo.din
784 m.d.comb += fp._connect_in(self.p, fn=processfn)
785
786 # next: make the FIFO (Queue object) "look" like a NextControl...
787 m.submodules.fn = fn = NextControl()
788 fn.valid_o, fn.ready_i, fn.data_o = fifo.readable, fifo.re, fifo.dout
789 connections = fn._connect_out(self.n, fn=nmoperator.cat)
790
791 # ok ok so we can't just do the ready/valid eqs straight:
792 # first 2 from connections are the ready/valid, 3rd is data.
793 if self.fwft:
794 m.d.comb += connections[:2] # combinatorial on next ready/valid
795 else:
796 m.d.sync += connections[:2] # non-fwft mode needs sync
797 data_o = connections[2] # get the data
798 data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
799 m.d.comb += data_o
800
801 return m
802
803
804 # aka "RegStage".
805 class UnbufferedPipeline(FIFOControl):
806 def __init__(self, stage, in_multi=None, stage_ctl=False):
807 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
808 fwft=True, pipe=False)
809
810 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
811 class PassThroughHandshake(FIFOControl):
812 def __init__(self, stage, in_multi=None, stage_ctl=False):
813 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
814 fwft=True, pipe=True)
815
816 # this is *probably* BufferedHandshake, although test #997 now succeeds.
817 class BufferedHandshake(FIFOControl):
818 def __init__(self, stage, in_multi=None, stage_ctl=False):
819 FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
820 fwft=True, pipe=False)
821
822
823 """
824 # this is *probably* SimpleHandshake (note: memory cell size=0)
825 class SimpleHandshake(FIFOControl):
826 def __init__(self, stage, in_multi=None, stage_ctl=False):
827 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
828 fwft=True, pipe=False)
829 """