""" Pipeline API.  For multi-input and multi-output variants, see multipipe.

    Associated development bugs:
    * http://bugs.libre-riscv.org/show_bug.cgi?id=64
    * http://bugs.libre-riscv.org/show_bug.cgi?id=57

    Important: see Stage API (stageapi.py) and IO Control API
    (iocontrol.py) in combination with below.  This module
    "combines" the Stage API with the IO Control API to create
    the Pipeline API.

    The one critically important difference between StageAPI and
    PipelineAPI:

    * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
    * PipelineAPI: synchronous registers / latches get added here

    RecordBasedStage is a convenience class that takes an input shape,
    an output shape, a "processing" function and an optional "setup"
    function.  Honestly though, there's not much more effort to just...
    create a class that returns a couple of Records
    (see ExampleAddRecordStage).

    PassThroughStage is a convenience class that takes a single function
    as a parameter; that function is called to create both the input and
    the output spec, which are therefore identical.  Its process()
    function simply returns its input.

    Instances of this class are completely redundant if handed to
    StageChain; however, when passed to UnbufferedPipeline they
    can be used to introduce a single clock delay.
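
The pass-through behaviour described above can be sketched in plain Python,
without nmigen.  This is an illustration only: `PassThroughSketch` and
`make_spec` are hypothetical names, and a dict stands in for a real data
spec such as a Record.

```python
class PassThroughSketch:
    # Hypothetical stand-in for a pass-through stage (illustration only).

    def __init__(self, iospecfn):
        self.iospecfn = iospecfn    # called afresh for every spec request

    def ispec(self):
        return self.iospecfn()      # input spec

    def ospec(self):
        return self.iospecfn()      # output spec: same shape, new object

    def process(self, i):
        return i                    # data passes through unaltered


def make_spec():
    # hypothetical data spec: a plain dict standing in for a Record
    return {"a": 0, "b": 0}


stage = PassThroughSketch(make_spec)
```

Because the same function builds both specs, the input and output always
have the same shape while remaining separate objects.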

    ControlBase is the base class for pipelines.  It contains the previous
    and next ready/valid/data.  It also has an extremely useful "connect"
    function that can be used to connect a chain of pipelines and present
    the exact same prev/next ready/valid/data API.

    Note: pipelines basically do not become pipelines as such until
    handed to a derivative of ControlBase.  ControlBase itself is *not*
    strictly considered a pipeline class.  Wishbone and AXI4 (master or
    slave) could be derived from ControlBase, for example.

    UnbufferedPipeline is a simple stalling clock-synchronised pipeline
    that has no buffering (unlike BufferedHandshake).  Data flows on
    *every* clock cycle when the conditions are right (nominally, when
    the input is valid and the output is ready).

    A stall anywhere along the line will result in a stall back-propagating
    down the entire chain.  The BufferedHandshake by contrast will buffer
    incoming data, allowing previous stages one clock cycle's grace before
    they too have to stall.

    An advantage of the UnbufferedPipeline over the buffered one is
    that the amount of logic needed (number of gates) is greatly
    reduced: there is no second set of buffers.

    The disadvantage of the UnbufferedPipeline is that the valid/ready
    logic, if chained together, is *combinatorial*, resulting in
    progressively larger gate delay.
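
The back-propagating stall can be illustrated with a small plain-Python
model (not nmigen).  It assumes each stage computes its ready as
`~data_valid | n.ready_i`, which is what the ControlBase-derived
UnbufferedPipeline in this module does; `chain_ready` is a hypothetical
helper written for this sketch.

```python
def chain_ready(data_valid, sink_ready):
    # data_valid[k] is True when stage k currently holds valid data.
    # A stage is ready when it is empty or when the stage after it is
    # ready, so the answer for stage 0 depends on every later stage.
    ready = [False] * len(data_valid)
    nxt = sink_ready
    for k in reversed(range(len(data_valid))):
        ready[k] = (not data_valid[k]) or nxt
        nxt = ready[k]
    return ready


# every stage full and the sink not ready: the whole chain stalls
stalled = chain_ready([True, True, True], sink_ready=False)

# an empty stage in the middle absorbs the stall for the stages before it
bubble = chain_ready([True, False, True], sink_ready=False)
```

The loop runs from the last stage to the first, which mirrors the way the
ready signal ripples backwards through one long combinatorial path.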

    PassThroughHandshake is a Control class that introduces a single clock
    delay, passing its data through unaltered.  Unlike RegisterPipeline
    (which relies on UnbufferedPipeline and PassThroughStage) it handles
    ready/valid itself.

    RegisterPipeline is a convenience class: because UnbufferedPipeline
    introduces a single clock delay, giving it a PassThroughStage as its
    stage results in a Pipeline stage that, duh, delays its (unmodified)
    input by one clock cycle.

    BufferedHandshake is an nmigen implementation of a buffered pipeline
    stage, based on zipcpu:
    https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

    this module requires quite a bit of thought to understand how it works
    (and why it is needed in the first place).  reading the above is
    *strongly* recommended.

    unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
    the STB / ACK signals to raise and lower (on separate clocks) before
    data may proceed (thus only allowing one piece of data to proceed
    on *ALTERNATE* cycles), the signalling here is a true pipeline
    where data will flow on *every* clock when the conditions are right.

    input acceptance conditions are when:
        * incoming previous-stage strobe (p.valid_i) is HIGH
        * outgoing previous-stage ready   (p.ready_o) is HIGH

    output transmission conditions are when:
        * outgoing next-stage strobe (n.valid_o) is HIGH
        * incoming next-stage ready   (n.ready_i) is HIGH

    the tricky bit is when the input has valid data and the output is not
    ready to accept it.  if it wasn't for the clock synchronisation, it
    would be possible to tell the input "hey don't send that data, we're
    not ready".  unfortunately, it's not possible to "change the past":
    the previous stage *has no choice* but to pass on its data.

    therefore, the incoming data *must* be accepted - and stored: that
    is the responsibility / contract that this stage *must* accept.
    on the same clock, it's possible to tell the input that it must
    not send any more data.  this is the "stall" condition.

    we now effectively have *two* possible pieces of data to "choose" from:
    the buffered data, and the incoming data.  the decision as to which
    to process and output is based on whether we are in "stall" or not.
    i.e. when the next stage is ready again, the output comes from
    the buffer if a stall had previously occurred, otherwise it comes
    direct from processing the input.

    this allows us to respect a synchronous "travelling STB" with what
    dan calls a "buffered handshake".

    it's quite a complex state machine!
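
The accept-and-store contract described above can be explored with a
simplified cycle-level model in plain Python.  This is a sketch of the
general buffered-handshake idea under stated assumptions, not a
transcription of BufferedHandshake.elaborate(): `SkidBufferModel` and
`run` are hypothetical names, and the model starts out ready.

```python
class SkidBufferModel:
    # Hypothetical cycle-level model of a buffered handshake stage.

    def __init__(self, process=lambda x: x):
        self.process = process
        self.out_valid = False   # plays the role of n.valid_o
        self.out_data = None     # plays the role of n.data_o
        self.buf_valid = False   # True while in the "stall" condition
        self.buf_data = None     # plays the role of r_data

    @property
    def ready_o(self):
        # the previous stage may send only while the buffer is empty
        return not self.buf_valid

    def clock(self, valid_i, data_i, ready_i):
        # Advance one clock edge; returns (accepted, sent) where sent is
        # the item taken by the next stage on this edge, or None.
        sent = self.out_data if (self.out_valid and ready_i) else None
        accepted = bool(valid_i) and self.ready_o
        if self.buf_valid:
            if ready_i:
                # stall released: the buffered item goes out first
                self.out_data, self.buf_valid = self.buf_data, False
        elif ready_i or not self.out_valid:
            # no stall: the processed input goes straight to the output
            self.out_valid = accepted
            if accepted:
                self.out_data = self.process(data_i)
        elif accepted:
            # output blocked, yet the input must be taken: buffer it
            self.buf_data, self.buf_valid = self.process(data_i), True
        return accepted, sent


def run(items, ready_pattern):
    # Push items through the model; the consumer repeats ready_pattern,
    # which must contain at least one True or the loop never ends.
    dut, pending, received, cycle = SkidBufferModel(), list(items), [], 0
    while len(received) < len(items):
        ready_i = ready_pattern[cycle % len(ready_pattern)]
        data_i = pending[0] if pending else None
        accepted, sent = dut.clock(bool(pending), data_i, ready_i)
        if accepted:
            pending.pop(0)
        if sent is not None:
            received.append(sent)
        cycle += 1
    return received


in_order = run([1, 2, 3, 4, 5], [False, False, True])
```

Even with a consumer that is ready only one cycle in three, every item
arrives exactly once and in order, because an item accepted during a
blocked cycle waits in the buffer and is flushed first.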

    Synchronised pipeline, based on:
    https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Mux, Module, Elaboratable, Const
from nmigen.cli import verilog, rtlil
from nmigen.hdl.rec import Record

from nmutil.queue import Queue

from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
from nmutil import nmoperator


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """
    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn
    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i): return self.__setup(m, i)


class PassThroughStage(StageCls):
    """ a pass-through stage with its input data spec identical to its output,
        and "passes through" its data from input to output (does nothing).

        use this basically to explicitly make any data spec Stage-compliant.
        (many APIs would potentially use a static "wrap" method in e.g.
         StageCls to achieve a similar effect)
    """
    def __init__(self, iospecfn): self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()
    def process(self, i): return i


class ControlBase(StageHelper, Elaboratable):
    """ Common functions for Pipeline API.  Note: a "pipeline stage" only
        exists (conceptually) when a ControlBase derivative is handed
        a Stage (combinatorial block)

        NOTE: ControlBase derives from StageHelper, making it accidentally
        compliant with the Stage API.  Using those functions directly
        *BYPASSES* a ControlBase instance ready/valid signalling, which
        clearly should not be done without a really, really good reason.
    """
    def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add data_i member to PrevControl (p) and
            * add data_o member to NextControl (n)
            Calling ControlBase._new_data is a good way to do that.
        """
        print ("ControlBase", self, stage, in_multi, stage_ctl)
        StageHelper.__init__(self, stage)

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
        self.n = NextControl(stage_ctl, maskwid=maskwid)

        # set up the input and output data
        if stage is not None:
            self._new_data("data")

    def _new_data(self, name):
        """ allocates new data_i and data_o
        """
        self.p.data_i, self.n.data_o = self.new_specs(name)

    @property
    def data_r(self):
        return self.process(self.p.data_i)

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out

                     [pipe1, pipe2, pipe3, pipe4]

                   out---in out--in out---in

            Also takes care of allocating data_i/data_o, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.data_i or self.n.data_o manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects
            it is called with.

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) will have to be
            done manually.

            Argument:

            * :pipechain: - a sequence of ControlBase-derived classes
                            (must be one or more in length)

            Returns:

            * a list of eq assignments that will need to be added in
              an elaborate() to m.d.comb
        """
        assert len(pipechain) > 0, "pipechain must be non-zero length"
        assert self.stage is None, "do not use connect with a stage"
        eqs = [] # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]                # earlier
            pipe2 = pipechain[i+1]              # later (by 1)
            eqs += pipe1.connect_to_next(pipe2) # earlier n to later p

        # connect front and back of chain to ourselves
        front = pipechain[0]                # first in chain
        end = pipechain[-1]                 # last in chain
        self.set_specs(front, end) # sets up ispec/ospec functions
        self._new_data("chain") # NOTE: REPLACES existing data
        eqs += front._connect_in(self)      # front p to our p
        eqs += end._connect_out(self)       # end n to our n

        return eqs

    def set_input(self, i):
        """ helper function to set the input data (used in unit tests)
        """
        return nmoperator.eq(self.p.data_i, i)

    def __iter__(self):
        yield from self.p # yields ready/valid/data (data also gets yielded)
        yield from self.n # ditto

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        self.setup(m, self.p.data_i)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.ready_i)
        m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted in a temporary register.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        input data p.data_i is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the stall is released, r_data is the first to be transferred
        to the output [flush_buffer()], and the stall condition is cleared.

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     o_n_validn.eq(~self.n.valid_o),
                     n_ready_i.eq(self.n.ready_i_test),
                     nir_por.eq(n_ready_i & self.p._ready_o),
                     nir_por_n.eq(n_ready_i & ~self.p._ready_o),
                     nir_novn.eq(n_ready_i | o_n_validn),
                     nirn_novn.eq(~n_ready_i & o_n_validn),
                     npnn.eq(nir_por | nirn_novn),
                     por_pivn.eq(self.p._ready_o & ~p_valid_i)
        ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += nmoperator.eq(result, self.data_r)

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.ready_o): # not stalled
            self.m.d.sync += nmoperator.eq(r_data, result) # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
                              nmoperator.eq(self.n.data_o, data_o), # update out
                             ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n): # stalled, but next is now ready
            # Flush the [already processed] buffer to the output port.
            data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
            self.m.d.sync += [self.n.valid_o.eq(1),  # reg empty
                              nmoperator.eq(self.n.data_o, data_o), # flush
                             ]
        # output ready conditions
        self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)

        return self.m


class MaskNoDelayCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline (that does not respect "ready")

        Based on (identical behaviour to) SimpleHandshake.
        TODO: decide whether to merge *into* SimpleHandshake.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # store result of processing in combinatorial temporary
        result = _spec(self.stage.ospec, "r_tmp")
        m.d.comb += nmoperator.eq(result, self.data_r)

        # establish if the data should be passed on: mask bits that are
        # cancelled (stop_i) are removed before testing for validity.
        # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
        # is NOT "normal" for the Stage API.
        p_valid_i = Signal(reset_less=True)
        #print ("self.p.data_i", self.p.data_i)
        maskedout = Signal(len(self.p.mask_i), reset_less=True)
        m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
        m.d.comb += p_valid_i.eq(maskedout.bool())

        # if idmask nonzero, mask gets passed on (and register set).
        # register is left as-is if idmask is zero, but out-mask is set to zero
        # note however: only the *uncancelled* mask bits get passed on
        m.d.sync += self.n.valid_o.eq(p_valid_i)
        m.d.sync += self.n.mask_o.eq(Mux(p_valid_i, maskedout, 0))
        with m.If(p_valid_i):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += nmoperator.eq(self.n.data_o, data_o) # update output

        # input always "ready"
        #m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
        m.d.comb += self.p._ready_o.eq(Const(1))

        # always pass on stop (as combinatorial: single signal)
        m.d.comb += self.n.stop_o.eq(self.p.stop_i)

        return self.m


class MaskCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline

        Arguments:

        * stage.  see Stage API above
        * maskwid - sets up cancellation capability (mask and stop).
        * dynamic - allows switching from sync to combinatorial (passthrough)
                    USE WITH CARE.  will need the entire pipe to be quiescent
                    before switching, otherwise data WILL be destroyed.

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
                       dynamic=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
        self.dynamic = dynamic
        if dynamic:
            self.latchmode = Signal()
        else:
            self.latchmode = Const(1)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        mask_r = Signal(len(self.p.mask_i), reset_less=True)
        data_r = _spec(self.stage.ospec, "data_r")
        m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))

        with m.If(self.latchmode):
            r_busy = Signal()
            r_latch = _spec(self.stage.ospec, "r_latch")

            # establish if the data should be passed on: mask bits that are
            # cancelled (stop_i) are removed before testing for validity.
            p_valid_i = Signal(reset_less=True)
            #print ("self.p.data_i", self.p.data_i)
            maskedout = Signal(len(self.p.mask_i), reset_less=True)
            m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)

            # establish some combinatorial temporaries
            n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
            p_valid_i_p_ready_o = Signal(reset_less=True)
            m.d.comb += [p_valid_i.eq(self.p.valid_i_test & maskedout.bool()),
                         n_ready_i.eq(self.n.ready_i_test),
                         p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
            ]

            # if idmask nonzero, mask gets passed on (and register set).
            # register is left as-is if idmask is zero, but out-mask is set to
            # zero
            # note however: only the *uncancelled* mask bits get passed on
            m.d.sync += mask_r.eq(Mux(p_valid_i, maskedout, 0))
            m.d.comb += self.n.mask_o.eq(mask_r)

            # always pass on stop (as combinatorial: single signal)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)

            stor = Signal(reset_less=True)
            m.d.comb += stor.eq(p_valid_i_p_ready_o | n_ready_i)
            with m.If(stor):
                # latch the processed result
                m.d.sync += nmoperator.eq(r_latch, data_r)

            # previous valid and ready
            with m.If(p_valid_i_p_ready_o):
                m.d.sync += r_busy.eq(1)      # output valid
            # previous invalid or not ready, however next is accepting
            with m.Elif(n_ready_i):
                m.d.sync += r_busy.eq(0) # ...so set output invalid

            # output set combinatorially from latch
            m.d.comb += nmoperator.eq(self.n.data_o, r_latch)

            m.d.comb += self.n.valid_o.eq(r_busy)
            # if next is ready, so is previous
            m.d.comb += self.p._ready_o.eq(n_ready_i)

        with m.Else():
            # pass everything straight through.  p connected to n: data,
            # valid, mask, everything.  this is "effectively" just a
            # StageChain: MaskCancellable is doing "nothing" except
            # combinatorially passing everything through
            # (except now it's *dynamically selectable* whether to do that)
            m.d.comb += self.n.valid_o.eq(self.p.valid_i_test)
            m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)
            m.d.comb += self.n.mask_o.eq(self.p.mask_i)
            m.d.comb += nmoperator.eq(self.n.data_o, data_r)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P
        i o i o  PoR  NoV    o o

        0 0 1 0   0    0     0 1   process(data_i)
        0 0 1 1   0    0     0 1   process(data_i)

        0 1 1 0   0    0     0 1   process(data_i)
        0 1 1 1   0    0     0 1   process(data_i)

        1 0 1 0   0    0     0 1   process(data_i)
        1 0 1 1   0    0     0 1   process(data_i)

        1 1 0 0   1    0     1 0   process(data_i)
        1 1 0 1   1    1     1 0   process(data_i)
        1 1 1 0   1    0     1 1   process(data_i)
        1 1 1 1   1    0     1 1   process(data_i)
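
A plain-Python sketch of the register update that this class's elaborate()
performs, for experimenting without nmigen.  `simple_handshake_step` is a
hypothetical helper; it assumes `busy` models r_busy (which drives
n.valid_o) and that p.ready_o simply follows n.ready_i.

```python
def simple_handshake_step(busy, data_o, valid_i, data_i, ready_i,
                          process=lambda x: x):
    # One clock edge: returns the next (busy, data_o) pair.
    ready_o = ready_i                   # if next is ready, so is previous
    if valid_i and ready_o:
        # previous valid and ready: take the data, output becomes valid
        return True, process(data_i)
    if ready_i:
        # nothing accepted, however next is accepting: output goes invalid
        # (data_o is still refreshed, as in elaborate())
        return False, process(data_i)
    # next stage not ready: hold everything
    return busy, data_o


state = (False, 0)
state = simple_handshake_step(*state, valid_i=True, data_i=7, ready_i=True)
```

With the next stage permanently ready, a new item can be accepted on every
clock, which is the "true pipeline" behaviour this module aims for.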
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal()
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        p_valid_i_p_ready_o = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     n_ready_i.eq(self.n.ready_i_test),
                     p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
        ]

        # store result of processing in combinatorial temporary
        m.d.comb += nmoperator.eq(result, self.data_r)

        # previous valid and ready
        with m.If(p_valid_i_p_ready_o):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += [r_busy.eq(1),      # output valid
                         nmoperator.eq(self.n.data_o, data_o), # update output
                        ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_ready_i):
            data_o = self._postprocess(result) # XXX TBD, does nothing right now
            m.d.sync += [nmoperator.eq(self.n.data_o, data_o)]
            # TODO: could still send data here (if there was any)
            #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0) # ...so set output invalid

        m.d.comb += self.n.valid_o.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._ready_o.eq(n_ready_i)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Attributes:

        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        r_data : output_shape according to ospec
            A temporary (buffered) copy of the processed result of a
            prior (valid) input.
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.
        result: output_shape according to ospec
            The output of the combinatorial logic.  it is updated
            COMBINATORIALLY (no clock dependence).

        Inputs   Temp  Output  Data

        1 1 0 0   0     1 1    process(data_i)
        1 1 0 1   1     1 0    process(data_i)
        1 1 1 0   0     1 1    process(data_i)
        1 1 1 1   0     1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal() # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        p_valid_i = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
        m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)

        m.d.comb += self.n.valid_o.eq(data_valid)
        m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
        m.d.sync += data_valid.eq(p_valid_i | buf_full)

        with m.If(pv):
            m.d.sync += nmoperator.eq(r_data, self.data_r)
        data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, data_o)

        return self.m


class UnbufferedPipeline2(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Attributes:

        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        buf : output_shape according to ospec
            A temporary (buffered) copy of a valid output
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.

        Inputs   Temp  Output  Data
        P P N N  ~NiR&  N P   (buf_full)
        i o i o  NoV    o o

        0 0 0 0   0     0 1   process(data_i)
        0 0 0 1   1     1 0   reg (odata, unchanged)
        0 0 1 0   0     0 1   process(data_i)
        0 0 1 1   0     0 1   process(data_i)

        0 1 0 0   0     0 1   process(data_i)
        0 1 0 1   1     1 0   reg (odata, unchanged)
        0 1 1 0   0     0 1   process(data_i)
        0 1 1 1   0     0 1   process(data_i)

        1 0 0 0   0     1 1   process(data_i)
        1 0 0 1   1     1 0   reg (odata, unchanged)
        1 0 1 0   0     1 1   process(data_i)
        1 0 1 1   0     1 1   process(data_i)

        1 1 0 0   0     1 1   process(data_i)
        1 1 0 1   1     1 0   reg (odata, unchanged)
        1 1 1 0   0     1 1   process(data_i)
        1 1 1 1   0     1 1   process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
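
The truth table above can be regenerated from three boolean expressions.
The sketch below is plain Python rather than nmigen; `unbuffered2_row` and
`unbuffered2_table` are hypothetical names, and the expressions are the
ones suggested by the column headings (buf_full = ~NiR & NoV).

```python
from itertools import product


def unbuffered2_row(p_valid_i, n_ready_i, n_valid_o):
    # returns (buf_full, n_valid_o, p_ready_o) for one row of the table
    buf_full = int((not n_ready_i) and bool(n_valid_o))
    valid_o = int(bool(buf_full) or bool(p_valid_i))
    ready_o = int(not buf_full)
    return buf_full, valid_o, ready_o


# p_ready_o is enumerated only to match the table layout: it is not
# used by any of the expressions
unbuffered2_table = {
    (pv, por, nr, nv): unbuffered2_row(pv, nr, nv)
    for pv, por, nr, nv in product((0, 1), repeat=4)
}
```

Each key is an (PiV, PoR, NiR, NoV) input combination and each value is
the corresponding (buf_full, N o, P o) triple.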
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        buf_full = Signal() # is data valid or not
        buf = _spec(self.stage.ospec, "r_tmp") # output type

        p_valid_i = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)

        m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
        m.d.comb += self.p._ready_o.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)

        data_o = Mux(buf_full, buf, self.data_r)
        data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, data_o)
        m.d.sync += nmoperator.eq(buf, self.n.data_o)

        return self.m


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary          Output Data
        -------  ------------------ ----- ----
        P P N N  PiV& PiV| NiR| pvr N P  (pvr)
        i o i o  PoR  ~PoR ~NoV     o o

        0 0 0 0  0    1    1    0   1 1   odata (unchanged)
        0 0 0 1  0    1    0    0   1 0   odata (unchanged)
        0 0 1 0  0    1    1    0   1 1   odata (unchanged)
        0 0 1 1  0    1    1    0   1 1   odata (unchanged)

        0 1 0 0  0    0    1    0   0 1   odata (unchanged)
        0 1 0 1  0    0    0    0   0 0   odata (unchanged)
        0 1 1 0  0    0    1    0   0 1   odata (unchanged)
        0 1 1 1  0    0    1    0   0 1   odata (unchanged)

        1 0 0 0  0    1    1    1   1 1   process(in)
        1 0 0 1  0    1    0    0   1 0   odata (unchanged)
        1 0 1 0  0    1    1    1   1 1   process(in)
        1 0 1 1  0    1    1    1   1 1   process(in)

        1 1 0 0  1    1    1    1   1 1   process(in)
        1 1 0 1  1    1    0    0   1 0   odata (unchanged)
        1 1 1 0  1    1    1    1   1 1   process(in)
        1 1 1 1  1    1    1    1   1 1   process(in)
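
Every column of the table above follows from the four inputs.  The sketch
below recomputes one row in plain Python (no nmigen); `passthrough_row` is
a hypothetical helper, and the expressions are read off the column
headings, with pvr taken as PiV ANDed with the newly computed P o.

```python
def passthrough_row(p_valid_i, p_ready_o, n_ready_i, n_valid_o):
    # all arguments are 0 or 1, in the column order of the table
    piv_and_por = p_valid_i & p_ready_o         # PiV & PoR
    piv_or_npor = p_valid_i | (1 - p_ready_o)   # PiV | ~PoR
    nir_or_nnov = n_ready_i | (1 - n_valid_o)   # NiR | ~NoV
    pvr = p_valid_i & nir_or_nnov               # valid and (new) ready
    valid_o = piv_or_npor                       # output column "N o"
    ready_o = nir_or_nnov                       # output column "P o"
    data = "process(in)" if pvr else "odata (unchanged)"
    return (piv_and_por, piv_or_npor, nir_or_nnov, pvr,
            valid_o, ready_o, data)


example = passthrough_row(1, 0, 0, 0)
```

The data column shows that new data is taken only when pvr is set; in all
other rows the registered output is left unchanged.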
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        p_valid_i = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)

        m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
        m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)

        odata = Mux(pvr, self.data_r, r_data)
        m.d.sync += nmoperator.eq(r_data, odata)
        r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.data_o, r_data)

        return self.m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of data_o and valid_o as an indirect byproduct
        of using PassThroughStage
    """
    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses Queue to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
    """
    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                       fwft=True, pipe=False):
        """ Arguments:

            * :depth: number of entries in the FIFO
            * :stage: data processing block
            * :fwft:  first word fall-thru mode (non-fwft introduces delay)
            * :pipe:  specifies pipe mode.

            when fwft = True it indicates that transfers may occur
            combinatorially through stage processing in the same clock cycle.
            This requires that the Stage be a Moore FSM:
                https://en.wikipedia.org/wiki/Moore_machine

            when fwft = False it indicates that all output signals are
            produced only from internal registers or memory, i.e. that the
            Stage is a Mealy FSM:
                https://en.wikipedia.org/wiki/Mealy_machine

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp  fn  temp  fp   self.n
            data_i->process()->result->cat->din.FIFO.dout->cat(data_o)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a de-cat
            function.
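
As a plain-Python analogy of the cat / de-cat step, the sketch below packs
several fields into one integer word (as the FIFO stores them) and splits
the word back into fields.  `pack` and `unpack` are hypothetical helpers
for illustration; in nmigen no separate unpack step is needed, because the
Cat() of the output fields is itself assigned from the FIFO output.

```python
def pack(fields):
    # fields is a list of (value, width) pairs; the first field ends up
    # in the least significant bits
    word, shift = 0, 0
    for value, width in fields:
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word, shift          # packed word and its total width


def unpack(word, widths):
    # split a packed word back into fields, least significant first
    out = []
    for width in widths:
        out.append(word & ((1 << width) - 1))
        word >>= width
    return out


word, total_width = pack([(0x3, 4), (0xAB, 8)])
fields = unpack(word, [4, 8])
```

The total width returned by `pack` corresponds to the single width that
the FIFO is created with.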
        """
        self.fwft = fwft
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the data_o.
        (fwidth, _) = nmoperator.shape(self.n.data_o)
        fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        def processfn(data_i):
            # store result of processing in combinatorial temporary
            result = _spec(self.stage.ospec, "r_temp")
            m.d.comb += nmoperator.eq(result, self.process(data_i))
            return nmoperator.cat(result)

        # prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.valid_i, fp._ready_o, fp.data_i = fifo.we, fifo.writable, fifo.din
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.valid_o, fn.ready_i, fn.data_o = fifo.readable, fifo.re, fifo.dout
        connections = fn._connect_out(self.n, fn=nmoperator.cat)
        valid_eq, ready_eq, data_o = connections

        # ok ok so we can't just do the ready/valid eqs straight:
        # first 2 from connections are the ready/valid, 3rd is data.
        if self.fwft:
            m.d.comb += [valid_eq, ready_eq] # combinatorial on next ready/valid
        else:
            m.d.sync += [valid_eq, ready_eq] # non-fwft mode needs sync
        data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
        m.d.comb += data_o

        return m


class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)


# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=True)


# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)


# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)