""" Pipeline API.  For multi-input and multi-output variants, see multipipe.

    Associated development bugs:
    * http://bugs.libre-riscv.org/show_bug.cgi?id=148
    * http://bugs.libre-riscv.org/show_bug.cgi?id=64
    * http://bugs.libre-riscv.org/show_bug.cgi?id=57

    Important: see Stage API (stageapi.py) and IO Control API
    (iocontrol.py) in combination with below.  This module
    "combines" the Stage API with the IO Control API to create
    the Pipeline API.

    The one critically important key difference between StageAPI and
    PipelineAPI is:

    * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
    * PipelineAPI: synchronous registers / latches get added here

    RecordBasedStage:
    ----------------

    A convenience class that takes an input shape, output shape, a
    "processing" function and an optional "setup" function.  Honestly
    though, there's not much more effort to just... create a class
    that returns a couple of Records (see ExampleAddRecordStage).

    PassThroughStage:
    ----------------

    A convenience class that takes a single function as a parameter,
    that is chain-called to create the exact same input and output spec.
    It has a process() function that simply returns its input.

    Instances of this class are completely redundant if handed to
    StageChain, however when passed to UnbufferedPipeline they
    can be used to introduce a single clock delay.

    ControlBase:
    -----------

    The base class for pipelines.  Contains previous and next ready/valid/data.
    Also has an extremely useful "connect" function that can be used to
    connect a chain of pipelines and present the exact same prev/next
    ready/valid/data API.

    Note: pipelines basically do not become pipelines as such until
    handed to a derivative of ControlBase.  ControlBase itself is *not*
    strictly considered a pipeline class.  Wishbone and AXI4 (master or
    slave) could be derived from ControlBase, for example.

    UnbufferedPipeline:
    ------------------

    A simple stalling clock-synchronised pipeline that has no buffering
    (unlike BufferedHandshake).  Data flows on *every* clock cycle when
    the conditions are right (this is nominally when the input is valid
    and the output is ready).

    A stall anywhere along the line will result in a stall back-propagating
    down the entire chain.  The BufferedHandshake by contrast will buffer
    incoming data, allowing previous stages one clock cycle's grace before
    being told to wait.

    An advantage of the UnbufferedPipeline over the Buffered one is
    that the amount of logic needed (number of gates) is greatly
    reduced (no second set of buffers basically).

    The disadvantage of the UnbufferedPipeline is that the valid/ready
    logic, if chained together, is *combinatorial*, resulting in
    progressively larger gate delay.
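
To make the combinatorial ready chain concrete, here is a minimal
plain-Python sketch (not nmigen; `ready_chain` and its argument names are
hypothetical and exist only for this illustration). It applies the
UnbufferedPipeline equation ready_o = ~data_valid | n.ready_i to each stage
in turn, starting from the far end of the chain.

```python
def ready_chain(data_valid, n_ready_i):
    # data_valid: one bool per stage, first stage first.
    # n_ready_i:  the "ready" presented to the last stage by its consumer.
    # Returns the ready_o of every stage, first stage first.
    ready = n_ready_i
    ready_o = []
    for valid in reversed(data_valid):
        # same equation as UnbufferedPipeline: ~data_valid | n.ready_i
        ready = (not valid) or ready
        ready_o.append(ready)
    return list(reversed(ready_o))

# four full stages, consumer stalled: the stall reaches the first stage
print(ready_chain([True, True, True, True], False))
# an empty third stage absorbs the stall for the stages before it
print(ready_chain([True, True, False, True], False))
```

Each loop iteration stands for one more level of logic that the ready
signal has to pass through within a single clock period, which is why the
gate delay grows with the length of the chain.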

    PassThroughHandshake:
    --------------------

    A Control class that introduces a single clock delay, passing its
    data through unaltered.  Unlike RegisterPipeline (which relies
    on UnbufferedPipeline and PassThroughStage) it handles ready/valid
    itself.

    RegisterPipeline:
    ----------------

    A convenience class: because UnbufferedPipeline introduces a single
    clock delay, when its stage is a PassThroughStage the result is a
    Pipeline stage that delays its (unmodified) input by one clock cycle.

    BufferedHandshake:
    -----------------

    nmigen implementation of buffered pipeline stage, based on zipcpu:
    https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

    this module requires quite a bit of thought to understand how it works
    (and why it is needed in the first place).  reading the above is
    *strongly* recommended.

    unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
    the STB / ACK signals to raise and lower (on separate clocks) before
    data may proceed (thus only allowing one piece of data to proceed
    on *ALTERNATE* cycles), the signalling here is a true pipeline
    where data will flow on *every* clock when the conditions are right.

    input acceptance conditions are when:
        * incoming previous-stage strobe (p.valid_i) is HIGH
        * outgoing previous-stage ready   (p.ready_o) is HIGH

    output transmission conditions are when:
        * outgoing next-stage strobe (n.valid_o) is HIGH
        * incoming next-stage ready   (n.ready_i) is HIGH

    the tricky bit is when the input has valid data and the output is not
    ready to accept it.  if it wasn't for the clock synchronisation, it
    would be possible to tell the input "hey don't send that data, we're
    not ready".  unfortunately, it's not possible to "change the past":
    the previous stage *has no choice* but to pass on its data.

    therefore, the incoming data *must* be accepted - and stored: that
    is the responsibility / contract that this stage *must* accept.
    on the same clock, it's possible to tell the input that it must
    not send any more data.  this is the "stall" condition.

    we now effectively have *two* possible pieces of data to "choose" from:
    the buffered data, and the incoming data.  the decision as to which
    to process and output is based on whether we are in "stall" or not.
    i.e. when the next stage becomes ready again, the output comes from
    the buffer if a stall had previously occurred, otherwise it comes
    direct from processing the input.

    this allows us to respect a synchronous "travelling STB" with what
    dan calls a "buffered handshake".

    it's quite a complex state machine!
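
The stall, buffer and flush behaviour described above can be tried out with
a small cycle-based model in plain Python. This is only a sketch under
stated assumptions: `BufferedHandshakeModel` and `run` are hypothetical
names, the model starts with ready_o already HIGH, and a well-behaved
producer is assumed that holds its data until it is accepted. The boolean
equations follow the ones used by BufferedHandshake in this module.

```python
class BufferedHandshakeModel:
    # hypothetical behavioural model, one clock() call per clock edge
    def __init__(self, process):
        self.process = process
        self.p_ready_o = True    # assumption: starts out ready
        self.n_valid_o = False
        self.n_data_o = None
        self.r_data = None       # the buffer

    def clock(self, p_valid_i, p_data_i, n_ready_i):
        # combinatorial part, using register values from before the edge
        result = self.process(p_data_i) if p_valid_i else None
        por, nov, old_r = self.p_ready_o, self.n_valid_o, self.r_data
        passthru = (n_ready_i and por) or (not n_ready_i and not nov)
        flush = n_ready_i and not por
        # synchronous part
        if por:                  # not stalled: keep a copy in the buffer
            self.r_data = result
        if passthru:             # data goes straight to the output
            self.n_valid_o, self.n_data_o = p_valid_i, result
        if flush:                # stall released: buffer goes out first
            self.n_valid_o, self.n_data_o = True, old_r
        self.p_ready_o = n_ready_i or not nov or (por and not p_valid_i)

def run(items, ready_pattern, process=lambda x: x * 10):
    dut = BufferedHandshakeModel(process)
    pending, received = list(items), []
    for n_ready_i in ready_pattern:
        p_valid_i = bool(pending)
        p_data_i = pending[0] if pending else None
        accepted = p_valid_i and dut.p_ready_o   # input handshake
        if dut.n_valid_o and n_ready_i:          # output handshake
            received.append(dut.n_data_o)
        dut.clock(p_valid_i, p_data_i, n_ready_i)
        if accepted:
            pending.pop(0)
    return received

pattern = [True, False, False, True, True, False, True, True, True, True]
print(run([1, 2, 3, 4], pattern))   # prints [10, 20, 30, 40]
```

Even though the consumer stalls twice, nothing is lost or repeated: the
item that arrives on the stalling cycle lands in r_data and is flushed to
the output before any new input is accepted.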

    SimpleHandshake:
    ---------------

    Synchronised pipeline, Based on:
    https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Mux, Module, Elaboratable, Const
from nmigen.cli import verilog, rtlil
from nmigen.hdl.rec import Record

from nmutil.queue import Queue

from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
from nmutil import nmoperator


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """
    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn
    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i): return self.__setup(m, i)


class PassThroughStage(StageCls):
    """ a pass-through stage with its input data spec identical to its output,
        and "passes through" its data from input to output (does nothing).

        use this basically to explicitly make any data spec Stage-compliant.
        (many APIs would potentially use a static "wrap" method in e.g.
         StageCls to achieve a similar effect)
    """
    def __init__(self, iospecfn): self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()


class ControlBase(StageHelper, Elaboratable):
    """ Common functions for Pipeline API.  Note: a "pipeline stage" only
        exists (conceptually) when a ControlBase derivative is handed
        a Stage (combinatorial block)

        NOTE: ControlBase derives from StageHelper, making it accidentally
        compliant with the Stage API.  Using those functions directly
        *BYPASSES* a ControlBase instance ready/valid signalling, which
        clearly should not be done without a really, really good reason.
    """
    def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add data_i member to PrevControl (p) and
            * add data_o member to NextControl (n)
            Calling ControlBase._new_data is a good way to do that.
        """
        print ("ControlBase", self, stage, in_multi, stage_ctl)
        StageHelper.__init__(self, stage)

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
        self.n = NextControl(stage_ctl, maskwid=maskwid)

        # set up the input and output data
        if stage is not None:
            self._new_data("data")

    def _new_data(self, name):
        """ allocates new data_i and data_o
        """
        self.p.data_i, self.n.data_o = self.new_specs(name)

    @property
    def data_r(self):
        return self.process(self.p.data_i)

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out

                     [pipe1, pipe2, pipe3, pipe4]

                       out---in out--in out---in

            Also takes care of allocating data_i/data_o, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.data_i or self.n.data_o manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects
            it is called with.

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) will have to be
            done manually.

            Arguments:

            * :pipechain: - a sequence of ControlBase-derived classes
                            (must be one or more in length)

            Returns:

            * a list of eq assignments that will need to be added in
              an elaborate() to m.d.comb
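
As a rough illustration of the wiring order that connect() follows, this
plain-Python sketch lists which side is joined to which. The function
`connection_plan` and its string labels are hypothetical and only mirror
the structure of the method; the real method returns nmigen eq
assignments, not strings.

```python
def connection_plan(names):
    # names: labels for the pipes in pipechain, first to last (hypothetical)
    assert len(names) > 0, "pipechain must be non-zero length"
    plan = []
    # inter-chain: each pipe's n side feeds the next pipe's p side
    for earlier, later in zip(names, names[1:]):
        plan.append((earlier + ".n", later + ".p"))
    # the two ends of the chain are wired to the enclosing object
    plan.append(("self.p", names[0] + ".p"))
    plan.append((names[-1] + ".n", "self.n"))
    return plan

for src, dst in connection_plan(["pipe1", "pipe2", "pipe3"]):
    print(src, "->", dst)
```

A chain of one pipe produces only the two end connections, which matches
the requirement that pipechain be one or more in length.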
        """
        assert len(pipechain) > 0, "pipechain must be non-zero length"
        assert self.stage is None, "do not use connect with a stage"
        eqs = [] # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]                # earlier
            pipe2 = pipechain[i+1]              # later (by 1)
            eqs += pipe1.connect_to_next(pipe2) # earlier n to later p

        # connect front and back of chain to ourselves
        front = pipechain[0]                # first in chain
        end = pipechain[-1]                 # last in chain
        self.set_specs(front, end) # sets up ispec/ospec functions
        self._new_data("chain") # NOTE: REPLACES existing data
        eqs += front._connect_in(self)      # front p to our p
        eqs += end._connect_out(self)       # end n   to our n

        return eqs

    def set_input(self, i):
        """ helper function to set the input data (used in unit tests)
        """
        return nmoperator.eq(self.p.data_i, i)

    def __iter__(self):
        yield from self.p # yields ready/valid/data (data also gets yielded)
        yield from self.n # ditto

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        self.setup(m, self.p.data_i)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.ready_i)
        m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted into a temporary register.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        input data p.data_i is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the non-stall condition is released, r_data is the first
        to be transferred to the output [flush_buffer()], and the stall
        condition is cleared.

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                          o_n_validn.eq(~self.n.valid_o),
                          n_ready_i.eq(self.n.ready_i_test),
                          nir_por.eq(n_ready_i & self.p._ready_o),
                          nir_por_n.eq(n_ready_i & ~self.p._ready_o),
                          nir_novn.eq(n_ready_i | o_n_validn),
                          nirn_novn.eq(~n_ready_i & o_n_validn),
                          npnn.eq(nir_por | nirn_novn),
                          por_pivn.eq(self.p._ready_o & ~p_valid_i)
                         ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += nmoperator.eq(result, self.data_r)

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.ready_o): # not stalled
            self.m.d.sync += nmoperator.eq(r_data, result) # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
                              nmoperator.eq(self.n.data_o, data_o), # update out
                             ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n): # not stalled
            # Flush the [already processed] buffer to the output port.
            data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
            self.m.d.sync += [self.n.valid_o.eq(1),  # reg empty
                              nmoperator.eq(self.n.data_o, data_o), # flush
                             ]
        # output ready conditions
        self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)

        return self.m


class MaskNoDelayCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline (that does not respect "ready")

        Based on (identical behaviour to) SimpleHandshake.
        TODO: decide whether to merge *into* SimpleHandshake.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
    """
    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # store result of processing in combinatorial temporary
        result = _spec(self.stage.ospec, "r_tmp")
        m.d.comb += nmoperator.eq(result, self.data_r)

        # establish if the data should be passed on.  cancellation is
        # a global signal.
        # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
        # is NOT "normal" for the Stage API.
        p_valid_i = Signal(reset_less=True)
        #print ("self.p.data_i", self.p.data_i)
        maskedout = Signal(len(self.p.mask_i), reset_less=True)
        m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
        m.d.comb += p_valid_i.eq(maskedout.bool())

        # if idmask nonzero, mask gets passed on (and register set).
        # register is left as-is if idmask is zero, but out-mask is set to zero
        # note however: only the *uncancelled* mask bits get passed on
        m.d.sync += self.n.valid_o.eq(p_valid_i)
        m.d.sync += self.n.mask_o.eq(Mux(p_valid_i, maskedout, 0))
        with m.If(p_valid_i):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += nmoperator.eq(self.n.data_o, data_o) # update output

        # input always "ready"
        #m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
        m.d.comb += self.p._ready_o.eq(Const(1))

        # always pass on stop (as combinatorial: single signal)
        m.d.comb += self.n.stop_o.eq(self.p.stop_i)

        return self.m


class MaskCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline

        Arguments:

        * stage.  see Stage API above
        * maskwid - sets up cancellation capability (mask and stop).
        * dynamic - allows switching from sync to combinatorial (passthrough)
                    USE WITH CARE.  will need the entire pipe to be quiescent
                    before switching, otherwise data WILL be destroyed.

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
    """
    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
                       dynamic=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
        self.dynamic = dynamic
        if dynamic:
            self.latchmode = Signal()
        else:
            self.latchmode = Const(1)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        mask_r = Signal(len(self.p.mask_i), reset_less=True)
        data_r = _spec(self.stage.ospec, "data_r")
        m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))

        with m.If(self.latchmode):
            r_busy = Signal()
            r_latch = _spec(self.stage.ospec, "r_latch")

            # establish if the data should be passed on.  cancellation is
            # a global signal.
            p_valid_i = Signal(reset_less=True)
            #print ("self.p.data_i", self.p.data_i)
            maskedout = Signal(len(self.p.mask_i), reset_less=True)
            m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)

            # establish some combinatorial temporaries
            n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
            p_valid_i_p_ready_o = Signal(reset_less=True)
            m.d.comb += [p_valid_i.eq(self.p.valid_i_test & maskedout.bool()),
                         n_ready_i.eq(self.n.ready_i_test),
                         p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
                        ]

            # if idmask nonzero, mask gets passed on (and register set).
            # register is left as-is if idmask is zero, but out-mask is set to
            # zero
            # note however: only the *uncancelled* mask bits get passed on
            m.d.sync += mask_r.eq(Mux(p_valid_i, maskedout, 0))
            m.d.comb += self.n.mask_o.eq(mask_r)

            # always pass on stop (as combinatorial: single signal)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)

            stor = Signal(reset_less=True)
            m.d.comb += stor.eq(p_valid_i_p_ready_o | n_ready_i)
            with m.If(stor):
                # store result of processing in combinatorial temporary
                m.d.sync += nmoperator.eq(r_latch, data_r)

            # previous valid and ready
            with m.If(p_valid_i_p_ready_o):
                m.d.sync += r_busy.eq(1)      # output valid
            # previous invalid or not ready, however next is accepting
            with m.Elif(n_ready_i):
                m.d.sync += r_busy.eq(0) # ...so set output invalid

            # output set combinatorially from latch
            m.d.comb += nmoperator.eq(self.n.data_o, r_latch)

            m.d.comb += self.n.valid_o.eq(r_busy)
            # if next is ready, so is previous
            m.d.comb += self.p._ready_o.eq(n_ready_i)

        with m.Else():
            # pass everything straight through.  p connected to n: data,
            # valid, mask, everything.  this is "effectively" just a
            # StageChain: MaskCancellable is doing "nothing" except
            # combinatorially passing everything through
            # (except now it's *dynamically selectable* whether to do that)
            m.d.comb += self.n.valid_o.eq(self.p.valid_i_test)
            m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)
            m.d.comb += self.n.mask_o.eq(self.p.mask_i)
            m.d.comb += nmoperator.eq(self.n.data_o, data_r)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P

        0 0 1 0   0    0     0 1   process(data_i)
        0 0 1 1   0    0     0 1   process(data_i)

        0 1 1 0   0    0     0 1   process(data_i)
        0 1 1 1   0    0     0 1   process(data_i)

        1 0 1 0   0    0     0 1   process(data_i)
        1 0 1 1   0    0     0 1   process(data_i)

        1 1 0 0   1    0     1 0   process(data_i)
        1 1 0 1   1    1     1 0   process(data_i)
        1 1 1 0   1    0     1 1   process(data_i)
        1 1 1 1   1    0     1 1   process(data_i)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal()
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        p_valid_i_p_ready_o = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     n_ready_i.eq(self.n.ready_i_test),
                     p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
                    ]

        # store result of processing in combinatorial temporary
        m.d.comb += nmoperator.eq(result, self.data_r)

        # previous valid and ready
        with m.If(p_valid_i_p_ready_o):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += [r_busy.eq(1),      # output valid
                         nmoperator.eq(self.n.data_o, data_o), # update output
                        ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_ready_i):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += [nmoperator.eq(self.n.data_o, data_o)]
            # TODO: could still send data here (if there was any)
            #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0) # ...so set output invalid

        m.d.comb += self.n.valid_o.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._ready_o.eq(n_ready_i)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Attributes:
        -----------
        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        r_data : input_shape according to ispec
            A temporary (buffered) copy of a prior (valid) input.
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.
        result: output_shape according to ospec
            The output of the combinatorial logic.  it is updated
            COMBINATORIALLY (no clock dependence).

        Inputs  Temp  Output  Data

        1 1 0 0   0    1 1    process(data_i)
        1 1 0 1   1    1 0    process(data_i)
        1 1 1 0   0    1 1    process(data_i)
        1 1 1 1   0    1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal() # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        # some temporaries
        p_valid_i = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
        m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)

        m.d.comb += self.n.valid_o.eq(data_valid)
        m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
        m.d.sync += data_valid.eq(p_valid_i | buf_full)

        with m.If(pv):
            m.d.sync += nmoperator.eq(r_data, self.data_r)
        data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, data_o)

        return self.m


class UnbufferedPipeline2(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Attributes:
        -----------
        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        buf : output_shape according to ospec
            A temporary (buffered) copy of a valid output
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.

        Inputs  Temp  Output  Data

        P P N N ~NiR&  N P    (buf_full)

        0 0 0 0   0    0 1    process(data_i)
        0 0 0 1   1    1 0    reg (odata, unchanged)
        0 0 1 0   0    0 1    process(data_i)
        0 0 1 1   0    0 1    process(data_i)

        0 1 0 0   0    0 1    process(data_i)
        0 1 0 1   1    1 0    reg (odata, unchanged)
        0 1 1 0   0    0 1    process(data_i)
        0 1 1 1   0    0 1    process(data_i)

        1 0 0 0   0    1 1    process(data_i)
        1 0 0 1   1    1 0    reg (odata, unchanged)
        1 0 1 0   0    1 1    process(data_i)
        1 0 1 1   0    1 1    process(data_i)

        1 1 0 0   0    1 1    process(data_i)
        1 1 0 1   1    1 0    reg (odata, unchanged)
        1 1 1 0   0    1 1    process(data_i)
        1 1 1 1   0    1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
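
The table above can be regenerated from three boolean expressions. The
sketch below is plain Python (not nmigen) and `unbuffered2_row` is a
hypothetical helper; it reads the temporary column as buf_full, which is
an interpretation of the table rather than something this docstring
states outright.

```python
from itertools import product

def unbuffered2_row(p_valid_i, p_ready_o, n_ready_i, n_valid_o):
    # temporary column: ~NiR & NoV (taken here to be buf_full)
    stall = (not n_ready_i) and n_valid_o
    # output columns: N_oV and P_oR
    n_valid_o_next = stall or p_valid_i
    p_ready_o_next = not stall
    return int(stall), int(n_valid_o_next), int(p_ready_o_next)

# inputs in table order: PiV PoR NiR NoV
for row in product((0, 1), repeat=4):
    print(*row, " ", *unbuffered2_row(*row))
```

Note that p_ready_o is accepted as an argument but never used in the
function body, which is the same observation as the note about PoR.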
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        buf_full = Signal() # is data valid or not
        buf = _spec(self.stage.ospec, "r_tmp") # output type

        # some temporaries
        p_valid_i = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)

        m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
        m.d.comb += self.p._ready_o.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)

        data_o = Mux(buf_full, buf, self.data_r)
        data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, data_o)
        m.d.sync += nmoperator.eq(buf, self.n.data_o)

        return self.m


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary          Output Data
        -------  ------------------ -----  ----
        P P N N  PiV& PiV| NiR| pvr  N P   (pvr)
        i o i o  PoR  ~PoR ~NoV      o o

        0 0 0 0   0    1    1   0    1 1   odata (unchanged)
        0 0 0 1   0    1    0   0    1 0   odata (unchanged)
        0 0 1 0   0    1    1   0    1 1   odata (unchanged)
        0 0 1 1   0    1    1   0    1 1   odata (unchanged)

        0 1 0 0   0    0    1   0    0 1   odata (unchanged)
        0 1 0 1   0    0    0   0    0 0   odata (unchanged)
        0 1 1 0   0    0    1   0    0 1   odata (unchanged)
        0 1 1 1   0    0    1   0    0 1   odata (unchanged)

        1 0 0 0   0    1    1   1    1 1   process(in)
        1 0 0 1   0    1    0   0    1 0   odata (unchanged)
        1 0 1 0   0    1    1   1    1 1   process(in)
        1 0 1 1   0    1    1   1    1 1   process(in)

        1 1 0 0   1    1    1   1    1 1   process(in)
        1 1 0 1   1    1    0   0    1 0   odata (unchanged)
        1 1 1 0   1    1    1   1    1 1   process(in)
        1 1 1 1   1    1    1   1    1 1   process(in)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        # temporaries
        p_valid_i = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)

        m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
        m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)

        odata = Mux(pvr, self.data_r, r_data)
        m.d.sync += nmoperator.eq(r_data, odata)
        r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, r_data)

        return self.m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of data_o and valid_o as an indirect byproduct
        of using PassThroughStage
    """
    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses Queue to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
    """
    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                       fwft=True, pipe=False):
        """ FIFO Control

            * :depth: number of entries in the FIFO
            * :stage: data processing block
            * :fwft:  first word fall-thru mode (non-fwft introduces delay)
            * :pipe:  specifies pipe mode.

            when fwft = True it indicates that transfers may occur
            combinatorially through stage processing in the same clock cycle.
            This requires that the Stage be a Mealy FSM:
            https://en.wikipedia.org/wiki/Mealy_machine

            when fwft = False it indicates that all output signals are
            produced only from internal registers or memory, i.e. that the
            Stage is a Moore FSM:
            https://en.wikipedia.org/wiki/Moore_machine

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp  fn  temp  fp   self.n
            data_i->process()->result->cat->din.FIFO.dout->cat(data_o)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a de-cat
            function
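
For readers unfamiliar with the idea, the following plain-Python sketch
shows what "catting" and "de-catting" amount to at the bit level. The
functions `cat` and `decat` here are hypothetical stand-ins written for
this note; the real code relies on nmigen's Cat() being assignable, so no
separate de-cat function exists there. Fields are packed least
significant first, matching the bit order of Cat().

```python
def cat(fields):
    # fields: list of (value, width) pairs, least significant field first
    packed, shift = 0, 0
    for value, width in fields:
        packed |= (value & ((1 << width) - 1)) << shift
        shift += width
    return packed, shift        # the packed word and its total width

def decat(packed, widths):
    # split a packed word back into fields of the given widths
    values = []
    for width in widths:
        values.append(packed & ((1 << width) - 1))
        packed >>= width
    return values

word, width = cat([(0x3, 4), (0xAB, 8), (1, 1)])
print(hex(word), width)         # prints 0x1ab3 13
print(decat(word, [4, 8, 1]))   # prints [3, 171, 1]
```

The total width returned by `cat` plays the role of the FIFO width: the
FIFO only ever sees one flat word of that many bits.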
        """
        self.fwft = fwft
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the data_o.
        (fwidth, _) = nmoperator.shape(self.n.data_o)
        fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        def processfn(data_i):
            # store result of processing in combinatorial temporary
            result = _spec(self.stage.ospec, "r_temp")
            m.d.comb += nmoperator.eq(result, self.process(data_i))
            return nmoperator.cat(result)

        # prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.valid_i, fp._ready_o, fp.data_i = fifo.w_en, fifo.w_rdy, fifo.w_data
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.valid_o, fn.ready_i, fn.data_o = fifo.r_rdy, fifo.r_en, fifo.r_data
        connections = fn._connect_out(self.n, fn=nmoperator.cat)
        valid_eq, ready_eq, data_o = connections

        # ok ok so we can't just do the ready/valid eqs straight:
        # first 2 from connections are the ready/valid, 3rd is data.
        if self.fwft:
            m.d.comb += [valid_eq, ready_eq] # combinatorial on next ready/valid
        else:
            m.d.sync += [valid_eq, ready_eq] # non-fwft mode needs sync
        data_o = self._postprocess(data_o) # XXX TBD, does nothing right now

        return m


class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)

# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=True)

# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)

# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)