""" Pipeline API.  For multi-input and multi-output variants, see multipipe.

    Associated development bugs:
    * http://bugs.libre-riscv.org/show_bug.cgi?id=64
    * http://bugs.libre-riscv.org/show_bug.cgi?id=57

    Important: read the Stage API (stageapi.py) in combination with the
    notes below.

    RecordBasedStage is a convenience class that takes an input shape,
    an output shape, a "processing" function and an optional "setup"
    function.  Honestly though, it takes little more effort to create a
    class that returns a couple of Records (see ExampleAddRecordStage).

    PassThroughStage is a convenience class that takes a single function
    as a parameter.  That function is called to create both the input spec
    and the output spec, which are therefore identical.  Its process()
    function simply returns its input.

    Instances of this class are completely redundant if handed to
    StageChain; however, when passed to UnbufferedPipeline they
    can be used to introduce a single clock delay.
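
The shape of a pass-through stage can be sketched in plain Python, without
nmigen.  PassThroughSketch and make_spec are hypothetical names used only for
illustration, and the spec is assumed to be a simple dict of field widths
rather than a real Signal or Record.

```python
# Plain-Python stand-in for the Stage API shape (ispec / ospec / process).
# PassThroughSketch and make_spec are hypothetical, not part of nmutil.
class PassThroughSketch:
    def __init__(self, iospecfn):
        self.iospecfn = iospecfn          # called once per spec request

    def ispec(self):
        return self.iospecfn()            # fresh input spec

    def ospec(self):
        return self.iospecfn()            # fresh output spec, same shape

    def process(self, i):
        return i                          # data passes through untouched


def make_spec():
    # assumed stand-in for a Signal/Record layout: field name -> bit width
    return {'a': 16, 'b': 16}


stage = PassThroughSketch(make_spec)
in_spec, out_spec = stage.ispec(), stage.ospec()
```

The two specs compare equal but are separate objects, which is the point of
passing a function rather than a ready-made spec.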

    ControlBase is the base class for pipelines.  It contains the previous
    and next ready/valid/data.  It also has an extremely useful "connect"
    function that can be used to connect a chain of pipelines and present
    the exact same prev/next ready/valid/data interface.

    Note: pipelines do not become pipelines as such until handed to a
    derivative of ControlBase.  ControlBase itself is *not* strictly
    considered a pipeline class.  Wishbone and AXI4 (master or slave)
    could be derived from ControlBase, for example.

    UnbufferedPipeline is a simple stalling clock-synchronised pipeline
    that has no buffering (unlike BufferedHandshake).  Data flows on
    *every* clock cycle when the conditions are right (nominally, when
    the input is valid and the output is ready).

    A stall anywhere along the line will result in a stall back-propagating
    down the entire chain.  The BufferedHandshake by contrast will buffer
    incoming data, allowing previous stages one clock cycle's grace before
    they too must stall.

    An advantage of the UnbufferedPipeline over the buffered one is
    that the amount of logic needed (number of gates) is greatly
    reduced (there is no second set of buffers).

    The disadvantage of the UnbufferedPipeline is that the valid/ready
    logic, if chained together, is *combinatorial*, resulting in a
    progressively larger gate delay as the chain grows.
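
The combinatorial ready chain can be illustrated with a small cycle-level
model in plain Python.  This is a sketch under assumptions: ready_chain and
step are hypothetical helpers, each stage is reduced to one valid flag and
one data register, and the equations mirror the UnbufferedPipeline ones
(ready = ~data_valid | next_ready) rather than reproducing the nmigen code.

```python
def ready_chain(valid, out_ready):
    # ready[k] is what stage k presents to its predecessor.  It is computed
    # from the far end backwards: one combinatorial path through every stage.
    ready = [False] * len(valid)
    downstream = out_ready
    for k in reversed(range(len(valid))):
        ready[k] = (not valid[k]) or downstream
        downstream = ready[k]
    return ready


def step(valid, data, in_valid, in_data, out_ready):
    # advance every stage by one clock; returns the new (valid, data) lists
    ready = ready_chain(valid, out_ready)
    last = len(valid) - 1
    new_valid, new_data = list(valid), list(data)
    for k in range(len(valid)):
        src_valid = in_valid if k == 0 else valid[k - 1]
        src_data = in_data if k == 0 else data[k - 1]
        nxt_ready = out_ready if k == last else ready[k + 1]
        buf_full = valid[k] and not nxt_ready     # holding data, cannot send
        new_valid[k] = src_valid or buf_full
        if src_valid and ready[k]:                # accept from predecessor
            new_data[k] = src_data
    return new_valid, new_data


full = [True, True, True]
stalled = ready_chain(full, out_ready=False)
flowing = step(full, [3, 2, 1], True, 4, out_ready=True)
```

With every stage full, a not-ready output reaches the input within the same
cycle, which is the gate-delay cost described above.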

    PassThroughHandshake is a Control class that introduces a single clock
    delay, passing its data through unaltered.  Unlike RegisterPipeline
    (which relies on UnbufferedPipeline and PassThroughStage) it handles
    ready/valid itself.

    RegisterPipeline is a convenience class: because UnbufferedPipeline
    introduces a single clock delay, giving it a PassThroughStage results
    in a pipeline stage that delays its (unmodified) input by one clock
    cycle.

    BufferedHandshake is an nmigen implementation of a buffered pipeline
    stage, based on zipcpu:
    https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

    this module requires quite a bit of thought to understand how it works
    (and why it is needed in the first place).  reading the above is
    *strongly* recommended.

    unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
    the STB / ACK signals to raise and lower (on separate clocks) before
    data may proceed (thus only allowing one piece of data to proceed
    on *ALTERNATE* cycles), the signalling here is a true pipeline
    where data will flow on *every* clock when the conditions are right.

    input acceptance conditions are when:
        * incoming previous-stage strobe (p.valid_i) is HIGH
        * outgoing previous-stage ready   (p.ready_o) is HIGH

    output transmission conditions are when:
        * outgoing next-stage strobe (n.valid_o) is HIGH
        * incoming next-stage ready   (n.ready_i) is HIGH

    the tricky bit is when the input has valid data and the output is not
    ready to accept it.  if it wasn't for the clock synchronisation, it
    would be possible to tell the input "hey don't send that data, we're
    not ready".  unfortunately, it's not possible to "change the past":
    the previous stage *has no choice* but to pass on its data.

    therefore, the incoming data *must* be accepted - and stored: that
    is the responsibility / contract that this stage *must* accept.
    on the same clock, it's possible to tell the input that it must
    not send any more data.  this is the "stall" condition.

    we now effectively have *two* possible pieces of data to "choose" from:
    the buffered data, and the incoming data.  the decision as to which
    to process and output is based on whether we are in "stall" or not.
    i.e. when the next stage is no longer ready, the output comes from
    the buffer if a stall had previously occurred, otherwise it comes
    direct from processing the input.

    this allows us to respect a synchronous "travelling STB" with what
    dan calls a "buffered handshake".

    it's quite a complex state machine!
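
The accept-and-store contract can be sketched as a cycle-level model in plain
Python.  This is a simplified two-register skid buffer, written under
assumptions: SkidBufferModel and run are hypothetical names, and the update
rules are the textbook ones, not the exact equations of BufferedHandshake.

```python
class SkidBufferModel:
    # one output register plus one spill ("skid") buffer
    def __init__(self, process):
        self.process = process
        self.out_valid = False   # plays the role of n.valid_o
        self.out_data = None     # plays the role of n.data_o
        self.buf_valid = False   # True while stalled with data parked
        self.buf_data = None     # the parked (already processed) data

    @property
    def ready_o(self):
        # registered: depends only on state, never directly on out_ready
        return not self.buf_valid

    def clock(self, in_valid, in_data, out_ready):
        # returns the item handed to the next stage this cycle, or None
        sent = self.out_data if (self.out_valid and out_ready) else None
        accept = in_valid and self.ready_o
        result = self.process(in_data) if accept else None
        if out_ready or not self.out_valid:      # output register is free
            if self.buf_valid:                   # flush the buffer first
                self.out_valid, self.out_data = True, self.buf_data
                self.buf_valid = False
            else:                                # straight through
                self.out_valid, self.out_data = accept, result
        elif accept:                             # stalled: park the data
            self.buf_valid, self.buf_data = True, result
        return sent


def run(items, ready_pattern, process):
    model, pending, received = SkidBufferModel(process), list(items), []
    for out_ready in ready_pattern:
        in_valid = bool(pending)
        in_data = pending[0] if pending else None
        accepted = in_valid and model.ready_o    # sampled before the edge
        sent = model.clock(in_valid, in_data, out_ready)
        if sent is not None:
            received.append(sent)
        if accepted:
            pending.pop(0)
    return received


pattern = [True, False, False, True, True, False] + [True] * 6
received = run([1, 2, 3, 4, 5], pattern, lambda x: x * 10)
```

Because ready_o is driven from a register, the previous stage sees the stall
one cycle late, and the parked item is what makes that one cycle of grace
safe: nothing is lost or reordered.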

    Synchronised pipeline, based on:
    https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Mux, Module, Elaboratable, Const
from nmigen.cli import verilog, rtlil
from nmigen.hdl.rec import Record

from nmutil.queue import Queue

from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
from nmutil import nmoperator


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """
    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn
    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i):
        # setupfn is optional: only call it if one was supplied
        if self.__setup is not None:
            return self.__setup(m, i)


class PassThroughStage(StageCls):
    """ a pass-through stage with its input data spec identical to its output,
        and "passes through" its data from input to output (does nothing).

        use this basically to explicitly make any data spec Stage-compliant.
        (many APIs would potentially use a static "wrap" method in e.g.
         StageCls to achieve a similar effect)
    """
    def __init__(self, iospecfn): self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()
    def process(self, i): return i


class ControlBase(StageHelper, Elaboratable):
    """ Common functions for Pipeline API.  Note: a "pipeline stage" only
        exists (conceptually) when a ControlBase derivative is handed
        a Stage (combinatorial block)

        NOTE: ControlBase derives from StageHelper, making it accidentally
        compliant with the Stage API.  Using those functions directly
        *BYPASSES* a ControlBase instance ready/valid signalling, which
        clearly should not be done without a really, really good reason.
    """
    def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add data_i member to PrevControl (p) and
            * add data_o member to NextControl (n)
            Calling ControlBase._new_data is a good way to do that.
        """
        print ("ControlBase", self, stage, in_multi, stage_ctl)
        StageHelper.__init__(self, stage)

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
        self.n = NextControl(stage_ctl, maskwid=maskwid)

        # set up the input and output data
        if stage is not None:
            self._new_data("data")

    def _new_data(self, name):
        """ allocates new data_i and data_o
        """
        self.p.data_i, self.n.data_o = self.new_specs(name)

    @property
    def data_r(self):
        return self.process(self.p.data_i)

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output sink.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out

                    [pipe1, pipe2, pipe3, pipe4]

                     out---in out--in out---in

            Also takes care of allocating data_i/data_o, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.data_i or self.n.data_o manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects
            it is given.

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) are outside the
            scope of this function (see multipipe).

            Arguments:

            * :pipechain: - a sequence of ControlBase-derived classes
                            (must be one or more in length)

            Returns:

            * a list of eq assignments that will need to be added in
              an elaborate() to m.d.comb
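
The wiring order can be sketched in plain Python.  This is an illustration
under assumptions: connect_sketch is a hypothetical helper, pipes are
represented by their names only, and each connection is a (driver, driven)
tuple instead of an nmigen assignment.

```python
# Hypothetical stand-ins: each "pipe" is just a name, each link a tuple.
def connect_sketch(owner, pipechain):
    assert len(pipechain) > 0, 'pipechain must be non-zero length'
    eqs = []
    # inter-chain links: each pipe's n side drives the next pipe's p side
    for earlier, later in zip(pipechain, pipechain[1:]):
        eqs.append((earlier + '.n', later + '.p'))
    # the owner's p feeds the front; the end of the chain feeds the owner's n
    eqs.append((owner + '.p', pipechain[0] + '.p'))
    eqs.append((pipechain[-1] + '.n', owner + '.n'))
    return eqs


links = connect_sketch('self', ['pipe1', 'pipe2', 'pipe3', 'pipe4'])
```

A chain of N pipes yields N-1 internal links plus the two links to the owner,
so a single-element chain is still valid.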
        """
        assert len(pipechain) > 0, "pipechain must be non-zero length"
        assert self.stage is None, "do not use connect with a stage"
        eqs = [] # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]                # earlier
            pipe2 = pipechain[i+1]              # later (by 1)
            eqs += pipe1.connect_to_next(pipe2) # earlier n to later p

        # connect front and back of chain to ourselves
        front = pipechain[0]                # first in chain
        end = pipechain[-1]                 # last in chain
        self.set_specs(front, end) # sets up ispec/ospec functions
        self._new_data("chain") # NOTE: REPLACES existing data
        eqs += front._connect_in(self)      # front p to our p
        eqs += end._connect_out(self)       # end n to our n

        return eqs

    def set_input(self, i):
        """ helper function to set the input data (used in unit tests)
        """
        return nmoperator.eq(self.p.data_i, i)

    def __iter__(self):
        yield from self.p # yields ready/valid/data (data also gets yielded)
        yield from self.n # ditto

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        self.setup(m, self.p.data_i)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.ready_i)
        m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted in a temporary register.

        Argument: stage.  see Stage API (stageapi.py)

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        input data p.data_i is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the stall condition is released, r_data is the first
        to be transferred to the output [flush_buffer()], and the stall
        condition is cleared.

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     o_n_validn.eq(~self.n.valid_o),
                     n_ready_i.eq(self.n.ready_i_test),
                     nir_por.eq(n_ready_i & self.p._ready_o),
                     nir_por_n.eq(n_ready_i & ~self.p._ready_o),
                     nir_novn.eq(n_ready_i | o_n_validn),
                     nirn_novn.eq(~n_ready_i & o_n_validn),
                     npnn.eq(nir_por | nirn_novn),
                     por_pivn.eq(self.p._ready_o & ~p_valid_i)
        ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += nmoperator.eq(result, self.data_r)

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.ready_o): # not stalled
            self.m.d.sync += nmoperator.eq(r_data, result) # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
                              nmoperator.eq(self.n.data_o, data_o), # update out
                             ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n): # not stalled
            # Flush the [already processed] buffer to the output port.
            data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
            self.m.d.sync += [self.n.valid_o.eq(1),  # reg empty
                              nmoperator.eq(self.n.data_o, data_o), # flush
                             ]
        # output ready conditions
        self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)

        return self.m


class MaskNoDelayCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline (that does not respect "ready")

        Based on (identical behaviour to) SimpleHandshake.
        TODO: decide whether to merge *into* SimpleHandshake.

        Argument: stage.  see Stage API (stageapi.py)

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # store result of processing in combinatorial temporary
        result = _spec(self.stage.ospec, "r_tmp")
        m.d.comb += nmoperator.eq(result, self.data_r)

        # establish whether the data should be passed on, i.e. whether
        # any mask bit survives cancellation.
        # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
        # is NOT "normal" for the Stage API.
        p_valid_i = Signal(reset_less=True)
        #print ("self.p.data_i", self.p.data_i)
        maskedout = Signal(len(self.p.mask_i), reset_less=True)
        m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
        m.d.comb += p_valid_i.eq(maskedout.bool())

        # if idmask nonzero, mask gets passed on (and register set).
        # register is left as-is if idmask is zero, but out-mask is set to zero
        # note however: only the *uncancelled* mask bits get passed on
        m.d.sync += self.n.valid_o.eq(p_valid_i)
        m.d.sync += self.n.mask_o.eq(Mux(p_valid_i, maskedout, 0))
        with m.If(p_valid_i):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += nmoperator.eq(self.n.data_o, data_o) # update output

        # input always "ready"
        #m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
        m.d.comb += self.p._ready_o.eq(Const(1))

        # always pass on stop (as combinatorial: single signal)
        m.d.comb += self.n.stop_o.eq(self.p.stop_i)

        return self.m


class MaskCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline

        Arguments:

        * stage.  see Stage API (stageapi.py)
        * maskwid - sets up cancellation capability (mask and stop).
        * dynamic - allows switching from sync to combinatorial (passthrough)
                    USE WITH CARE.  will need the entire pipe to be quiescent
                    before switching, otherwise data WILL be destroyed.

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
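
The mask-and-stop arithmetic can be tried out with plain integers.  This is
a sketch under assumptions: cancel is a hypothetical helper, mask_i and
stop_i are taken to be bit vectors of the same width, and the valid_i
qualification that this class also applies is left out.

```python
def cancel(mask_i, stop_i):
    # bits set in stop_i are cancelled; whatever survives is passed on
    maskedout = mask_i & ~stop_i
    p_valid = maskedout != 0          # equivalent of maskedout.bool()
    mask_o = maskedout if p_valid else 0
    return p_valid, mask_o


partly_cancelled = cancel(0b1011, 0b0010)
fully_cancelled = cancel(0b0110, 0b0110)
```

Only the uncancelled bits travel onward, and once every bit is cancelled
the data is no longer treated as valid.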
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
                       dynamic=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
        self.dynamic = dynamic
        if dynamic:
            self.latchmode = Signal()
        else:
            self.latchmode = Const(1)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_mask = Signal(len(self.p.mask_i), reset_less=True)
        data_r = _spec(self.stage.ospec, "data_r")
        m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))

        with m.If(self.latchmode):
            r_busy = Signal()
            r_latch = _spec(self.stage.ospec, "r_latch")

            # establish whether the data should be passed on, i.e. whether
            # any mask bit survives cancellation.
            p_valid_i = Signal(reset_less=True)
            #print ("self.p.data_i", self.p.data_i)
            maskedout = Signal(len(self.p.mask_i), reset_less=True)
            m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)

            # establish some combinatorial temporaries
            n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
            p_valid_i_p_ready_o = Signal(reset_less=True)
            m.d.comb += [p_valid_i.eq(self.p.valid_i_test & maskedout.bool()),
                         n_ready_i.eq(self.n.ready_i_test),
                         p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
            ]

            # if idmask nonzero, mask gets passed on (and register set).
            # register is left as-is if idmask is zero, but out-mask is set
            # to zero
            # note however: only the *uncancelled* mask bits get passed on
            m.d.sync += r_mask.eq(Mux(p_valid_i, maskedout, 0))
            m.d.comb += self.n.mask_o.eq(r_mask)

            # always pass on stop (as combinatorial: single signal)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)

            stor = Signal(reset_less=True)
            m.d.comb += stor.eq(p_valid_i_p_ready_o | n_ready_i)
            with m.If(stor):
                # latch the result of processing
                m.d.sync += nmoperator.eq(r_latch, data_r)

            # previous valid and ready
            with m.If(p_valid_i_p_ready_o):
                m.d.sync += r_busy.eq(1)      # output valid
            # previous invalid or not ready, however next is accepting
            with m.Elif(n_ready_i):
                m.d.sync += r_busy.eq(0) # ...so set output invalid

            # output set combinatorially from latch
            m.d.comb += nmoperator.eq(self.n.data_o, r_latch)

            m.d.comb += self.n.valid_o.eq(r_busy)
            # if next is ready, so is previous
            m.d.comb += self.p._ready_o.eq(n_ready_i)

        with m.Else():
            # pass everything straight through.  p connected to n: data,
            # valid, mask, everything.  this is "effectively" just a
            # StageChain (except now dynamically selectable)
            m.d.comb += self.n.valid_o.eq(self.p.valid_i_test)
            m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)
            m.d.comb += self.n.mask_o.eq(self.p.mask_i)
            m.d.comb += nmoperator.eq(self.n.data_o, data_r)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API (stageapi.py)

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P
        i o i o  PoR  NoV    o o

        0 0 1 0   0    0     0 1    process(data_i)
        0 0 1 1   0    0     0 1    process(data_i)

        0 1 1 0   0    0     0 1    process(data_i)
        0 1 1 1   0    0     0 1    process(data_i)

        1 0 1 0   0    0     0 1    process(data_i)
        1 0 1 1   0    0     0 1    process(data_i)

        1 1 0 0   1    0     1 0    process(data_i)
        1 1 0 1   1    1     1 0    process(data_i)
        1 1 1 0   1    0     1 1    process(data_i)
        1 1 1 1   1    0     1 1    process(data_i)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal()
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        p_valid_i_p_ready_o = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     n_ready_i.eq(self.n.ready_i_test),
                     p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
        ]

        # store result of processing in combinatorial temporary
        m.d.comb += nmoperator.eq(result, self.data_r)

        # previous valid and ready
        with m.If(p_valid_i_p_ready_o):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += [r_busy.eq(1),      # output valid
                         nmoperator.eq(self.n.data_o, data_o), # update output
                        ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_ready_i):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += [nmoperator.eq(self.n.data_o, data_o)]
            # TODO: could still send data here (if there was any)
            #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0) # ...so set output invalid

        m.d.comb += self.n.valid_o.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._ready_o.eq(n_ready_i)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API (stageapi.py)

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        r_data : shaped according to ospec
            A temporary (buffered) copy of a prior (valid) processed input.
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.
        result: output_shape according to ospec
            The output of the combinatorial logic.  it is updated
            COMBINATORIALLY (no clock dependence).

        Inputs  Temp  Output  Data
        P P N N ~NiR&  N P
        i o i o  NoV   o o

        1 1 0 0   0    1 1    process(data_i)
        1 1 0 1   1    1 0    process(data_i)
        1 1 1 0   0    1 1    process(data_i)
        1 1 1 1   0    1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal() # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        # some temporaries
        p_valid_i = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
        m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)

        m.d.comb += self.n.valid_o.eq(data_valid)
        m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
        m.d.sync += data_valid.eq(p_valid_i | buf_full)

        # only capture new data when the input is both valid and accepted
        with m.If(pv):
            m.d.sync += nmoperator.eq(r_data, self.data_r)
        data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, data_o)

        return self.m


class UnbufferedPipeline2(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API (stageapi.py)

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        buf : output_shape according to ospec
            A temporary (buffered) copy of a valid output.
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.

        Inputs  Temp  Output  Data
        P P N N ~NiR&  N P    (buf_full)
        i o i o  NoV   o o

        0 0 0 0   0    0 1    process(data_i)
        0 0 0 1   1    1 0    reg (odata, unchanged)
        0 0 1 0   0    0 1    process(data_i)
        0 0 1 1   0    0 1    process(data_i)

        0 1 0 0   0    0 1    process(data_i)
        0 1 0 1   1    1 0    reg (odata, unchanged)
        0 1 1 0   0    0 1    process(data_i)
        0 1 1 1   0    0 1    process(data_i)

        1 0 0 0   0    1 1    process(data_i)
        1 0 0 1   1    1 0    reg (odata, unchanged)
        1 0 1 0   0    1 1    process(data_i)
        1 0 1 1   0    1 1    process(data_i)

        1 1 0 0   0    1 1    process(data_i)
        1 1 0 1   1    1 0    reg (odata, unchanged)
        1 1 1 0   0    1 1    process(data_i)
        1 1 1 1   0    1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        buf_full = Signal() # is the buffer holding data or not
        buf = _spec(self.stage.ospec, "r_tmp") # output type

        # some temporaries
        p_valid_i = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)

        m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
        m.d.comb += self.p._ready_o.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)

        data_o = Mux(buf_full, buf, self.data_r)
        data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, data_o)
        m.d.sync += nmoperator.eq(buf, self.n.data_o)

        return self.m


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary          Output Data
        -------  ------------------ -----  ----
        P P N N  PiV& PiV| NiR| pvr   N P  (pvr)
        i o i o  PoR  ~PoR ~NoV       o o

        0 0 0 0   0    1    1    0    1 1   odata (unchanged)
        0 0 0 1   0    1    0    0    1 0   odata (unchanged)
        0 0 1 0   0    1    1    0    1 1   odata (unchanged)
        0 0 1 1   0    1    1    0    1 1   odata (unchanged)

        0 1 0 0   0    0    1    0    0 1   odata (unchanged)
        0 1 0 1   0    0    0    0    0 0   odata (unchanged)
        0 1 1 0   0    0    1    0    0 1   odata (unchanged)
        0 1 1 1   0    0    1    0    0 1   odata (unchanged)

        1 0 0 0   0    1    1    1    1 1   process(in)
        1 0 0 1   0    1    0    0    1 0   odata (unchanged)
        1 0 1 0   0    1    1    1    1 1   process(in)
        1 0 1 1   0    1    1    1    1 1   process(in)

        1 1 0 0   1    1    1    1    1 1   process(in)
        1 1 0 1   1    1    0    0    1 0   odata (unchanged)
        1 1 1 0   1    1    1    1    1 1   process(in)
        1 1 1 1   1    1    1    1    1 1   process(in)
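
The rows above can be regenerated from four boolean expressions.  This is a
sketch under assumptions: passthrough_row is a hypothetical helper, and pvr
is computed from the freshly derived ready (NiR | ~NoV) rather than from the
PoR input column, which is what the rows of the table imply.

```python
from itertools import product


def passthrough_row(piv, por, nir, nov):
    t1 = piv & por               # PiV & PoR
    t2 = piv | (1 - por)         # PiV | ~PoR  -> next n.valid_o
    t3 = nir | (1 - nov)         # NiR | ~NoV  -> p.ready_o
    pvr = piv & t3               # input accepted this cycle
    data = 'process(in)' if pvr else 'odata (unchanged)'
    return (t1, t2, t3, pvr, t2, t3, data)


# key: (PiV, PoR, NiR, NoV); value: temporaries, outputs and data source
table = {bits: passthrough_row(*bits) for bits in product((0, 1), repeat=4)}
```

Regenerating the table this way is a cheap cross-check whenever the control
equations are changed.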
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        # temporaries
        p_valid_i = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)

        m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
        m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)

        odata = Mux(pvr, self.data_r, r_data)
        m.d.sync += nmoperator.eq(r_data, odata)
        r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, r_data)

        return m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of data_o and valid_o as an indirect byproduct
        of using PassThroughStage
    """
    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses Queue to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
    """
    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                                     fwft=True, pipe=False):
        """ FIFO Control

            * :depth: number of entries in the FIFO
            * :stage: data processing block
            * :fwft:  first word fall-thru mode (non-fwft introduces delay)
            * :pipe:  specifies pipe mode.

            when fwft = True it indicates that transfers may occur
            combinatorially through stage processing in the same clock cycle,
            i.e. that the Stage behaves as a Mealy FSM:
            https://en.wikipedia.org/wiki/Mealy_machine

            when fwft = False it indicates that all output signals are
            produced only from internal registers or memory, i.e. that the
            Stage is a Moore FSM:
            https://en.wikipedia.org/wiki/Moore_machine

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp  fn  temp  fp   self.n
            data_i->process()->result->cat->din.FIFO.dout->cat(data_o)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a separate
            de-cat function.
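
The packing that cat performs can be sketched with plain integers.  This is
an illustration under assumptions: cat and decat below are hypothetical
stand-ins (plain Python has no assignable Cat(), so the unpacking has to be
written out), the layout is invented, and the first field is placed in the
low bits as nmigen's Cat does.

```python
def cat(fields, values):
    # pack named fields into one integer, first field in the low bits
    word, shift = 0, 0
    for name, width in fields:
        word |= (values[name] & ((1 << width) - 1)) << shift
        shift += width
    return word, shift            # packed word and its total width


def decat(fields, word):
    # reverse of cat: slice the word back into named fields
    values, shift = {}, 0
    for name, width in fields:
        values[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return values


layout = [('op', 4), ('a', 8), ('b', 8)]      # hypothetical data_o layout
din, fwidth = cat(layout, {'op': 0x3, 'a': 0x12, 'b': 0xAB})
dout = din                                    # whatever the FIFO stored
data_o = decat(layout, dout)
```

The total width returned by cat corresponds to the width of the FIFO that
has to be built to hold one entry.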
        """
        self.fwft = fwft
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the data_o.
        (fwidth, _) = nmoperator.shape(self.n.data_o)
        fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        def processfn(data_i):
            # store result of processing in combinatorial temporary
            result = _spec(self.stage.ospec, "r_temp")
            m.d.comb += nmoperator.eq(result, self.process(data_i))
            return nmoperator.cat(result)

        # prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.valid_i, fp._ready_o, fp.data_i = fifo.we, fifo.writable, fifo.din
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.valid_o, fn.ready_i, fn.data_o = fifo.readable, fifo.re, fifo.dout
        connections = fn._connect_out(self.n, fn=nmoperator.cat)

        # ok ok so we can't just do the ready/valid eqs straight:
        # first 2 from connections are the ready/valid, 3rd is data.
        if self.fwft:
            m.d.comb += connections[:2] # combinatorial on next ready/valid
        else:
            m.d.sync += connections[:2] # non-fwft mode needs sync
        data_o = connections[2] # get the data
        data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
        m.d.comb += data_o

        return m


# NOTE: the classes below rebind names already defined earlier in this file.
class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)

# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=True)

# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)

# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
980 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
981 fwft=True, pipe=False)