1 """ Pipeline API. For multi-input and multi-output variants, see multipipe.
3 This work is funded through NLnet under Grant 2019-02-012
8 Associated development bugs:
9 * http://bugs.libre-riscv.org/show_bug.cgi?id=148
10 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
11 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
13 Important: see Stage API (stageapi.py) and IO Control API
14 (iocontrol.py) in combination with below. This module
15 "combines" the Stage API with the IO Control API to create
18 The one critically important key difference between StageAPI and
21 * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
22 * PipelineAPI: synchronous registers / latches get added here
27 A convenience class that takes an input shape, output shape, a
28 "processing" function and an optional "setup" function. Honestly
29 though, there's not much more effort to just... create a class
30 that returns a couple of Records (see ExampleAddRecordStage in
36 A convenience class that takes a single function as a parameter,
37 that is chain-called to create the exact same input and output spec.
38 It has a process() function that simply returns its input.
40 Instances of this class are completely redundant if handed to
41 StageChain, however when passed to UnbufferedPipeline they
42 can be used to introduce a single clock delay.
47 The base class for pipelines. Contains previous and next ready/valid/data.
48 Also has an extremely useful "connect" function that can be used to
49 connect a chain of pipelines and present the exact same prev/next
52 Note: pipelines basically do not become pipelines as such until
53 handed to a derivative of ControlBase. ControlBase itself is *not*
54 strictly considered a pipeline class. Wishbone and AXI4 (master or
55 slave) could be derived from ControlBase, for example.
59 A simple stalling clock-synchronised pipeline that has no buffering
60 (unlike BufferedHandshake). Data flows on *every* clock cycle when
61 the conditions are right (this is nominally when the input is valid
62 and the output is ready).
64 A stall anywhere along the line will result in a stall back-propagating
65 down the entire chain. The BufferedHandshake by contrast will buffer
66 incoming data, allowing previous stages one clock cycle's grace before
69 An advantage of the UnbufferedPipeline over the Buffered one is
70 that the amount of logic needed (number of gates) is greatly
71 reduced (no second set of buffers basically)
73 The disadvantage of the UnbufferedPipeline is that the valid/ready
74 logic, if chained together, is *combinatorial*, resulting in
75 progressively larger gate delay.
80 A Control class that introduces a single clock delay, passing its
81 data through unaltered. Unlike RegisterPipeline (which relies
82 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
88 A convenience class that, because UnbufferedPipeline introduces a single
89 clock delay, when its stage is a PassThroughStage, it results in a Pipeline
90 stage that, duh, delays its (unmodified) input by one clock cycle.
95 nmigen implementation of buffered pipeline stage, based on zipcpu:
96 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
98 this module requires quite a bit of thought to understand how it works
99 (and why it is needed in the first place). reading the above is
100 *strongly* recommended.
102 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
103 the STB / ACK signals to raise and lower (on separate clocks) before
104 data may proceeed (thus only allowing one piece of data to proceed
105 on *ALTERNATE* cycles), the signalling here is a true pipeline
106 where data will flow on *every* clock when the conditions are right.
108 input acceptance conditions are when:
109 * incoming previous-stage strobe (p.valid_i) is HIGH
110 * outgoing previous-stage ready (p.ready_o) is LOW
112 output transmission conditions are when:
113 * outgoing next-stage strobe (n.valid_o) is HIGH
114 * outgoing next-stage ready (n.ready_i) is LOW
116 the tricky bit is when the input has valid data and the output is not
117 ready to accept it. if it wasn't for the clock synchronisation, it
118 would be possible to tell the input "hey don't send that data, we're
119 not ready". unfortunately, it's not possible to "change the past":
120 the previous stage *has no choice* but to pass on its data.
122 therefore, the incoming data *must* be accepted - and stored: that
123 is the responsibility / contract that this stage *must* accept.
124 on the same clock, it's possible to tell the input that it must
125 not send any more data. this is the "stall" condition.
127 we now effectively have *two* possible pieces of data to "choose" from:
128 the buffered data, and the incoming data. the decision as to which
129 to process and output is based on whether we are in "stall" or not.
130 i.e. when the next stage is no longer ready, the output comes from
131 the buffer if a stall had previously occurred, otherwise it comes
132 direct from processing the input.
134 this allows us to respect a synchronous "travelling STB" with what
135 dan calls a "buffered handshake".
137 it's quite a complex state machine!
142 Synchronised pipeline, Based on:
143 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
146 from nmigen
import Signal
, Mux
, Module
, Elaboratable
, Const
147 from nmigen
.cli
import verilog
, rtlil
148 from nmigen
.hdl
.rec
import Record
150 from nmutil
.queue
import Queue
153 from nmutil
.iocontrol
import (PrevControl
, NextControl
, Object
, RecordObject
)
154 from nmutil
.stageapi
import (_spec
, StageCls
, Stage
, StageChain
, StageHelper
)
155 from nmutil
import nmoperator
158 class RecordBasedStage(Stage
):
159 """ convenience class which provides a Records-based layout.
160 honestly it's a lot easier just to create a direct Records-based
161 class (see ExampleAddRecordStage)
163 def __init__(self
, in_shape
, out_shape
, processfn
, setupfn
=None):
164 self
.in_shape
= in_shape
165 self
.out_shape
= out_shape
166 self
.__process
= processfn
167 self
.__setup
= setupfn
168 def ispec(self
): return Record(self
.in_shape
)
169 def ospec(self
): return Record(self
.out_shape
)
170 def process(seif
, i
): return self
.__process
(i
)
171 def setup(seif
, m
, i
): return self
.__setup
(m
, i
)
174 class PassThroughStage(StageCls
):
175 """ a pass-through stage with its input data spec identical to its output,
176 and "passes through" its data from input to output (does nothing).
178 use this basically to explicitly make any data spec Stage-compliant.
179 (many APIs would potentially use a static "wrap" method in e.g.
180 StageCls to achieve a similar effect)
182 def __init__(self
, iospecfn
): self
.iospecfn
= iospecfn
183 def ispec(self
): return self
.iospecfn()
184 def ospec(self
): return self
.iospecfn()
187 class ControlBase(StageHelper
, Elaboratable
):
188 """ Common functions for Pipeline API. Note: a "pipeline stage" only
189 exists (conceptually) when a ControlBase derivative is handed
190 a Stage (combinatorial block)
192 NOTE: ControlBase derives from StageHelper, making it accidentally
193 compliant with the Stage API. Using those functions directly
194 *BYPASSES* a ControlBase instance ready/valid signalling, which
195 clearly should not be done without a really, really good reason.
197 def __init__(self
, stage
=None, in_multi
=None, stage_ctl
=False, maskwid
=0):
198 """ Base class containing ready/valid/data to previous and next stages
200 * p: contains ready/valid to the previous stage
201 * n: contains ready/valid to the next stage
203 Except when calling Controlbase.connect(), user must also:
204 * add data_i member to PrevControl (p) and
205 * add data_o member to NextControl (n)
206 Calling ControlBase._new_data is a good way to do that.
208 print ("ControlBase", self
, stage
, in_multi
, stage_ctl
)
209 StageHelper
.__init
__(self
, stage
)
211 # set up input and output IO ACK (prev/next ready/valid)
212 self
.p
= PrevControl(in_multi
, stage_ctl
, maskwid
=maskwid
)
213 self
.n
= NextControl(stage_ctl
, maskwid
=maskwid
)
215 # set up the input and output data
216 if stage
is not None:
217 self
._new
_data
("data")
219 def _new_data(self
, name
):
220 """ allocates new data_i and data_o
222 self
.p
.data_i
, self
.n
.data_o
= self
.new_specs(name
)
226 return self
.process(self
.p
.data_i
)
228 def connect_to_next(self
, nxt
):
229 """ helper function to connect to the next stage data/valid/ready.
231 return self
.n
.connect_to_next(nxt
.p
)
233 def _connect_in(self
, prev
):
234 """ internal helper function to connect stage to an input source.
235 do not use to connect stage-to-stage!
237 return self
.p
._connect
_in
(prev
.p
)
239 def _connect_out(self
, nxt
):
240 """ internal helper function to connect stage to an output source.
241 do not use to connect stage-to-stage!
243 return self
.n
._connect
_out
(nxt
.n
)
245 def connect(self
, pipechain
):
246 """ connects a chain (list) of Pipeline instances together and
247 links them to this ControlBase instance:
249 in <----> self <---> out
252 [pipe1, pipe2, pipe3, pipe4]
255 out---in out--in out---in
257 Also takes care of allocating data_i/data_o, by looking up
258 the data spec for each end of the pipechain. i.e It is NOT
259 necessary to allocate self.p.data_i or self.n.data_o manually:
260 this is handled AUTOMATICALLY, here.
262 Basically this function is the direct equivalent of StageChain,
263 except that unlike StageChain, the Pipeline logic is followed.
265 Just as StageChain presents an object that conforms to the
266 Stage API from a list of objects that also conform to the
267 Stage API, an object that calls this Pipeline connect function
268 has the exact same pipeline API as the list of pipline objects
271 Thus it becomes possible to build up larger chains recursively.
272 More complex chains (multi-input, multi-output) will have to be
277 * :pipechain: - a sequence of ControlBase-derived classes
278 (must be one or more in length)
282 * a list of eq assignments that will need to be added in
283 an elaborate() to m.d.comb
285 assert len(pipechain
) > 0, "pipechain must be non-zero length"
286 assert self
.stage
is None, "do not use connect with a stage"
287 eqs
= [] # collated list of assignment statements
289 # connect inter-chain
290 for i
in range(len(pipechain
)-1):
291 pipe1
= pipechain
[i
] # earlier
292 pipe2
= pipechain
[i
+1] # later (by 1)
293 eqs
+= pipe1
.connect_to_next(pipe2
) # earlier n to later p
295 # connect front and back of chain to ourselves
296 front
= pipechain
[0] # first in chain
297 end
= pipechain
[-1] # last in chain
298 self
.set_specs(front
, end
) # sets up ispec/ospec functions
299 self
._new
_data
("chain") # NOTE: REPLACES existing data
300 eqs
+= front
._connect
_in
(self
) # front p to our p
301 eqs
+= end
._connect
_out
(self
) # end n to our n
305 def set_input(self
, i
):
306 """ helper function to set the input data (used in unit tests)
308 return nmoperator
.eq(self
.p
.data_i
, i
)
311 yield from self
.p
# yields ready/valid/data (data also gets yielded)
312 yield from self
.n
# ditto
317 def elaborate(self
, platform
):
318 """ handles case where stage has dynamic ready/valid functions
321 m
.submodules
.p
= self
.p
322 m
.submodules
.n
= self
.n
324 self
.setup(m
, self
.p
.data_i
)
326 if not self
.p
.stage_ctl
:
329 # intercept the previous (outgoing) "ready", combine with stage ready
330 m
.d
.comb
+= self
.p
.s_ready_o
.eq(self
.p
._ready
_o
& self
.stage
.d_ready
)
332 # intercept the next (incoming) "ready" and combine it with data valid
333 sdv
= self
.stage
.d_valid(self
.n
.ready_i
)
334 m
.d
.comb
+= self
.n
.d_valid
.eq(self
.n
.ready_i
& sdv
)
339 class BufferedHandshake(ControlBase
):
340 """ buffered pipeline stage. data and strobe signals travel in sync.
341 if ever the input is ready and the output is not, processed data
342 is shunted in a temporary register.
344 Argument: stage. see Stage API above
346 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
347 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
348 stage-1 p.data_i >>in stage n.data_o out>> stage+1
354 input data p.data_i is read (only), is processed and goes into an
355 intermediate result store [process()]. this is updated combinatorially.
357 in a non-stall condition, the intermediate result will go into the
358 output (update_output). however if ever there is a stall, it goes
359 into r_data instead [update_buffer()].
361 when the non-stall condition is released, r_data is the first
362 to be transferred to the output [flush_buffer()], and the stall
365 on the next cycle (as long as stall is not raised again) the
366 input may begin to be processed and transferred directly to output.
369 def elaborate(self
, platform
):
370 self
.m
= ControlBase
.elaborate(self
, platform
)
372 result
= _spec(self
.stage
.ospec
, "r_tmp")
373 r_data
= _spec(self
.stage
.ospec
, "r_data")
375 # establish some combinatorial temporaries
376 o_n_validn
= Signal(reset_less
=True)
377 n_ready_i
= Signal(reset_less
=True, name
="n_i_rdy_data")
378 nir_por
= Signal(reset_less
=True)
379 nir_por_n
= Signal(reset_less
=True)
380 p_valid_i
= Signal(reset_less
=True)
381 nir_novn
= Signal(reset_less
=True)
382 nirn_novn
= Signal(reset_less
=True)
383 por_pivn
= Signal(reset_less
=True)
384 npnn
= Signal(reset_less
=True)
385 self
.m
.d
.comb
+= [p_valid_i
.eq(self
.p
.valid_i_test
),
386 o_n_validn
.eq(~self
.n
.valid_o
),
387 n_ready_i
.eq(self
.n
.ready_i_test
),
388 nir_por
.eq(n_ready_i
& self
.p
._ready
_o
),
389 nir_por_n
.eq(n_ready_i
& ~self
.p
._ready
_o
),
390 nir_novn
.eq(n_ready_i | o_n_validn
),
391 nirn_novn
.eq(~n_ready_i
& o_n_validn
),
392 npnn
.eq(nir_por | nirn_novn
),
393 por_pivn
.eq(self
.p
._ready
_o
& ~p_valid_i
)
396 # store result of processing in combinatorial temporary
397 self
.m
.d
.comb
+= nmoperator
.eq(result
, self
.data_r
)
399 # if not in stall condition, update the temporary register
400 with self
.m
.If(self
.p
.ready_o
): # not stalled
401 self
.m
.d
.sync
+= nmoperator
.eq(r_data
, result
) # update buffer
403 # data pass-through conditions
404 with self
.m
.If(npnn
):
405 data_o
= self
._postprocess
(result
) # XXX TBD, does nothing right now
406 self
.m
.d
.sync
+= [self
.n
.valid_o
.eq(p_valid_i
), # valid if p_valid
407 nmoperator
.eq(self
.n
.data_o
, data_o
), # update out
409 # buffer flush conditions (NOTE: can override data passthru conditions)
410 with self
.m
.If(nir_por_n
): # not stalled
411 # Flush the [already processed] buffer to the output port.
412 data_o
= self
._postprocess
(r_data
) # XXX TBD, does nothing right now
413 self
.m
.d
.sync
+= [self
.n
.valid_o
.eq(1), # reg empty
414 nmoperator
.eq(self
.n
.data_o
, data_o
), # flush
416 # output ready conditions
417 self
.m
.d
.sync
+= self
.p
._ready
_o
.eq(nir_novn | por_pivn
)
422 class MaskNoDelayCancellable(ControlBase
):
423 """ Mask-activated Cancellable pipeline (that does not respect "ready")
425 Based on (identical behaviour to) SimpleHandshake.
426 TODO: decide whether to merge *into* SimpleHandshake.
428 Argument: stage. see Stage API above
430 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
431 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
432 stage-1 p.data_i >>in stage n.data_o out>> stage+1
436 def __init__(self
, stage
, maskwid
, in_multi
=None, stage_ctl
=False):
437 ControlBase
.__init
__(self
, stage
, in_multi
, stage_ctl
, maskwid
)
439 def elaborate(self
, platform
):
440 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
442 # store result of processing in combinatorial temporary
443 result
= _spec(self
.stage
.ospec
, "r_tmp")
444 m
.d
.comb
+= nmoperator
.eq(result
, self
.data_r
)
446 # establish if the data should be passed on. cancellation is
448 # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
449 # is NOT "normal" for the Stage API.
450 p_valid_i
= Signal(reset_less
=True)
451 #print ("self.p.data_i", self.p.data_i)
452 maskedout
= Signal(len(self
.p
.mask_i
), reset_less
=True)
453 m
.d
.comb
+= maskedout
.eq(self
.p
.mask_i
& ~self
.p
.stop_i
)
454 m
.d
.comb
+= p_valid_i
.eq(maskedout
.bool())
456 # if idmask nonzero, mask gets passed on (and register set).
457 # register is left as-is if idmask is zero, but out-mask is set to zero
458 # note however: only the *uncancelled* mask bits get passed on
459 m
.d
.sync
+= self
.n
.valid_o
.eq(p_valid_i
)
460 m
.d
.sync
+= self
.n
.mask_o
.eq(Mux(p_valid_i
, maskedout
, 0))
461 with m
.If(p_valid_i
):
462 data_o
= self
._postprocess
(result
) # XXX TBD, does nothing right now
463 m
.d
.sync
+= nmoperator
.eq(self
.n
.data_o
, data_o
) # update output
466 # input always "ready"
467 #m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
468 m
.d
.comb
+= self
.p
._ready
_o
.eq(Const(1))
470 # always pass on stop (as combinatorial: single signal)
471 m
.d
.comb
+= self
.n
.stop_o
.eq(self
.p
.stop_i
)
476 class MaskCancellable(ControlBase
):
477 """ Mask-activated Cancellable pipeline
481 * stage. see Stage API above
482 * maskwid - sets up cancellation capability (mask and stop).
485 * dynamic - allows switching from sync to combinatorial (passthrough)
486 USE WITH CARE. will need the entire pipe to be quiescent
487 before switching, otherwise data WILL be destroyed.
489 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
490 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
491 stage-1 p.data_i >>in stage n.data_o out>> stage+1
495 def __init__(self
, stage
, maskwid
, in_multi
=None, stage_ctl
=False,
497 ControlBase
.__init
__(self
, stage
, in_multi
, stage_ctl
, maskwid
)
498 self
.dynamic
= dynamic
500 self
.latchmode
= Signal()
502 self
.latchmode
= Const(1)
504 def elaborate(self
, platform
):
505 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
507 mask_r
= Signal(len(self
.p
.mask_i
), reset_less
=True)
508 data_r
= _spec(self
.stage
.ospec
, "data_r")
509 m
.d
.comb
+= nmoperator
.eq(data_r
, self
._postprocess
(self
.data_r
))
511 with m
.If(self
.latchmode
):
513 r_latch
= _spec(self
.stage
.ospec
, "r_latch")
515 # establish if the data should be passed on. cancellation is
517 p_valid_i
= Signal(reset_less
=True)
518 #print ("self.p.data_i", self.p.data_i)
519 maskedout
= Signal(len(self
.p
.mask_i
), reset_less
=True)
520 m
.d
.comb
+= maskedout
.eq(self
.p
.mask_i
& ~self
.p
.stop_i
)
522 # establish some combinatorial temporaries
523 n_ready_i
= Signal(reset_less
=True, name
="n_i_rdy_data")
524 p_valid_i_p_ready_o
= Signal(reset_less
=True)
525 m
.d
.comb
+= [p_valid_i
.eq(self
.p
.valid_i_test
& maskedout
.bool()),
526 n_ready_i
.eq(self
.n
.ready_i_test
),
527 p_valid_i_p_ready_o
.eq(p_valid_i
& self
.p
.ready_o
),
530 # if idmask nonzero, mask gets passed on (and register set).
531 # register is left as-is if idmask is zero, but out-mask is set to
533 # note however: only the *uncancelled* mask bits get passed on
534 m
.d
.sync
+= mask_r
.eq(Mux(p_valid_i
, maskedout
, 0))
535 m
.d
.comb
+= self
.n
.mask_o
.eq(mask_r
)
537 # always pass on stop (as combinatorial: single signal)
538 m
.d
.comb
+= self
.n
.stop_o
.eq(self
.p
.stop_i
)
540 stor
= Signal(reset_less
=True)
541 m
.d
.comb
+= stor
.eq(p_valid_i_p_ready_o | n_ready_i
)
543 # store result of processing in combinatorial temporary
544 m
.d
.sync
+= nmoperator
.eq(r_latch
, data_r
)
546 # previous valid and ready
547 with m
.If(p_valid_i_p_ready_o
):
548 m
.d
.sync
+= r_busy
.eq(1) # output valid
549 # previous invalid or not ready, however next is accepting
550 with m
.Elif(n_ready_i
):
551 m
.d
.sync
+= r_busy
.eq(0) # ...so set output invalid
553 # output set combinatorially from latch
554 m
.d
.comb
+= nmoperator
.eq(self
.n
.data_o
, r_latch
)
556 m
.d
.comb
+= self
.n
.valid_o
.eq(r_busy
)
557 # if next is ready, so is previous
558 m
.d
.comb
+= self
.p
._ready
_o
.eq(n_ready_i
)
561 # pass everything straight through. p connected to n: data,
562 # valid, mask, everything. this is "effectively" just a
563 # StageChain: MaskCancellable is doing "nothing" except
564 # combinatorially passing everything through
565 # (except now it's *dynamically selectable* whether to do that)
566 m
.d
.comb
+= self
.n
.valid_o
.eq(self
.p
.valid_i_test
)
567 m
.d
.comb
+= self
.p
._ready
_o
.eq(self
.n
.ready_i_test
)
568 m
.d
.comb
+= self
.n
.stop_o
.eq(self
.p
.stop_i
)
569 m
.d
.comb
+= self
.n
.mask_o
.eq(self
.p
.mask_i
)
570 m
.d
.comb
+= nmoperator
.eq(self
.n
.data_o
, data_r
)
575 class SimpleHandshake(ControlBase
):
576 """ simple handshake control. data and strobe signals travel in sync.
577 implements the protocol used by Wishbone and AXI4.
579 Argument: stage. see Stage API above
581 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
582 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
583 stage-1 p.data_i >>in stage n.data_o out>> stage+1
588 Inputs Temporary Output Data
589 ------- ---------- ----- ----
590 P P N N PiV& ~NiR& N P
597 0 0 1 0 0 0 0 1 process(data_i)
598 0 0 1 1 0 0 0 1 process(data_i)
602 0 1 1 0 0 0 0 1 process(data_i)
603 0 1 1 1 0 0 0 1 process(data_i)
607 1 0 1 0 0 0 0 1 process(data_i)
608 1 0 1 1 0 0 0 1 process(data_i)
610 1 1 0 0 1 0 1 0 process(data_i)
611 1 1 0 1 1 1 1 0 process(data_i)
612 1 1 1 0 1 0 1 1 process(data_i)
613 1 1 1 1 1 0 1 1 process(data_i)
617 def elaborate(self
, platform
):
618 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
621 result
= _spec(self
.stage
.ospec
, "r_tmp")
623 # establish some combinatorial temporaries
624 n_ready_i
= Signal(reset_less
=True, name
="n_i_rdy_data")
625 p_valid_i_p_ready_o
= Signal(reset_less
=True)
626 p_valid_i
= Signal(reset_less
=True)
627 m
.d
.comb
+= [p_valid_i
.eq(self
.p
.valid_i_test
),
628 n_ready_i
.eq(self
.n
.ready_i_test
),
629 p_valid_i_p_ready_o
.eq(p_valid_i
& self
.p
.ready_o
),
632 # store result of processing in combinatorial temporary
633 m
.d
.comb
+= nmoperator
.eq(result
, self
.data_r
)
635 # previous valid and ready
636 with m
.If(p_valid_i_p_ready_o
):
637 data_o
= self
._postprocess
(result
) # XXX TBD, does nothing right now
638 m
.d
.sync
+= [r_busy
.eq(1), # output valid
639 nmoperator
.eq(self
.n
.data_o
, data_o
), # update output
641 # previous invalid or not ready, however next is accepting
642 with m
.Elif(n_ready_i
):
643 data_o
= self
._postprocess
(result
) # XXX TBD, does nothing right now
644 m
.d
.sync
+= [nmoperator
.eq(self
.n
.data_o
, data_o
)]
645 # TODO: could still send data here (if there was any)
646 #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
647 m
.d
.sync
+= r_busy
.eq(0) # ...so set output invalid
649 m
.d
.comb
+= self
.n
.valid_o
.eq(r_busy
)
650 # if next is ready, so is previous
651 m
.d
.comb
+= self
.p
._ready
_o
.eq(n_ready_i
)
656 class UnbufferedPipeline(ControlBase
):
657 """ A simple pipeline stage with single-clock synchronisation
658 and two-way valid/ready synchronised signalling.
660 Note that a stall in one stage will result in the entire pipeline
663 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
664 travel synchronously with the data: the valid/ready signalling
665 combines in a *combinatorial* fashion. Therefore, a long pipeline
666 chain will lengthen propagation delays.
668 Argument: stage. see Stage API, above
670 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
671 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
672 stage-1 p.data_i >>in stage n.data_o out>> stage+1
680 p.data_i : StageInput, shaped according to ispec
682 p.data_o : StageOutput, shaped according to ospec
684 r_data : input_shape according to ispec
685 A temporary (buffered) copy of a prior (valid) input.
686 This is HELD if the output is not ready. It is updated
688 result: output_shape according to ospec
689 The output of the combinatorial logic. it is updated
690 COMBINATORIALLY (no clock dependence).
694 Inputs Temp Output Data
716 1 1 0 0 0 1 1 process(data_i)
717 1 1 0 1 1 1 0 process(data_i)
718 1 1 1 0 0 1 1 process(data_i)
719 1 1 1 1 0 1 1 process(data_i)
722 Note: PoR is *NOT* involved in the above decision-making.
725 def elaborate(self
, platform
):
726 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
728 data_valid
= Signal() # is data valid or not
729 r_data
= _spec(self
.stage
.ospec
, "r_tmp") # output type
732 p_valid_i
= Signal(reset_less
=True)
733 pv
= Signal(reset_less
=True)
734 buf_full
= Signal(reset_less
=True)
735 m
.d
.comb
+= p_valid_i
.eq(self
.p
.valid_i_test
)
736 m
.d
.comb
+= pv
.eq(self
.p
.valid_i
& self
.p
.ready_o
)
737 m
.d
.comb
+= buf_full
.eq(~self
.n
.ready_i_test
& data_valid
)
739 m
.d
.comb
+= self
.n
.valid_o
.eq(data_valid
)
740 m
.d
.comb
+= self
.p
._ready
_o
.eq(~data_valid | self
.n
.ready_i_test
)
741 m
.d
.sync
+= data_valid
.eq(p_valid_i | buf_full
)
744 m
.d
.sync
+= nmoperator
.eq(r_data
, self
.data_r
)
745 data_o
= self
._postprocess
(r_data
) # XXX TBD, does nothing right now
746 m
.d
.comb
+= nmoperator
.eq(self
.n
.data_o
, data_o
)
751 class UnbufferedPipeline2(ControlBase
):
752 """ A simple pipeline stage with single-clock synchronisation
753 and two-way valid/ready synchronised signalling.
755 Note that a stall in one stage will result in the entire pipeline
758 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
759 travel synchronously with the data: the valid/ready signalling
760 combines in a *combinatorial* fashion. Therefore, a long pipeline
761 chain will lengthen propagation delays.
763 Argument: stage. see Stage API, above
765 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
766 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
767 stage-1 p.data_i >>in stage n.data_o out>> stage+1
772 p.data_i : StageInput, shaped according to ispec
774 p.data_o : StageOutput, shaped according to ospec
776 buf : output_shape according to ospec
777 A temporary (buffered) copy of a valid output
778 This is HELD if the output is not ready. It is updated
781 Inputs Temp Output Data
783 P P N N ~NiR& N P (buf_full)
788 0 0 0 0 0 0 1 process(data_i)
789 0 0 0 1 1 1 0 reg (odata, unchanged)
790 0 0 1 0 0 0 1 process(data_i)
791 0 0 1 1 0 0 1 process(data_i)
793 0 1 0 0 0 0 1 process(data_i)
794 0 1 0 1 1 1 0 reg (odata, unchanged)
795 0 1 1 0 0 0 1 process(data_i)
796 0 1 1 1 0 0 1 process(data_i)
798 1 0 0 0 0 1 1 process(data_i)
799 1 0 0 1 1 1 0 reg (odata, unchanged)
800 1 0 1 0 0 1 1 process(data_i)
801 1 0 1 1 0 1 1 process(data_i)
803 1 1 0 0 0 1 1 process(data_i)
804 1 1 0 1 1 1 0 reg (odata, unchanged)
805 1 1 1 0 0 1 1 process(data_i)
806 1 1 1 1 0 1 1 process(data_i)
809 Note: PoR is *NOT* involved in the above decision-making.
812 def elaborate(self
, platform
):
813 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
815 buf_full
= Signal() # is data valid or not
816 buf
= _spec(self
.stage
.ospec
, "r_tmp") # output type
819 p_valid_i
= Signal(reset_less
=True)
820 m
.d
.comb
+= p_valid_i
.eq(self
.p
.valid_i_test
)
822 m
.d
.comb
+= self
.n
.valid_o
.eq(buf_full | p_valid_i
)
823 m
.d
.comb
+= self
.p
._ready
_o
.eq(~buf_full
)
824 m
.d
.sync
+= buf_full
.eq(~self
.n
.ready_i_test
& self
.n
.valid_o
)
826 data_o
= Mux(buf_full
, buf
, self
.data_r
)
827 data_o
= self
._postprocess
(data_o
) # XXX TBD, does nothing right now
828 m
.d
.comb
+= nmoperator
.eq(self
.n
.data_o
, data_o
)
829 m
.d
.sync
+= nmoperator
.eq(buf
, self
.n
.data_o
)
834 class PassThroughHandshake(ControlBase
):
835 """ A control block that delays by one clock cycle.
837 Inputs Temporary Output Data
838 ------- ------------------ ----- ----
839 P P N N PiV& PiV| NiR| pvr N P (pvr)
840 i o i o PoR ~PoR ~NoV o o
844 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
845 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
846 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
847 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
849 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
850 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
851 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
852 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
854 1 0 0 0 0 1 1 1 1 1 process(in)
855 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
856 1 0 1 0 0 1 1 1 1 1 process(in)
857 1 0 1 1 0 1 1 1 1 1 process(in)
859 1 1 0 0 1 1 1 1 1 1 process(in)
860 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
861 1 1 1 0 1 1 1 1 1 1 process(in)
862 1 1 1 1 1 1 1 1 1 1 process(in)
867 def elaborate(self
, platform
):
868 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
870 r_data
= _spec(self
.stage
.ospec
, "r_tmp") # output type
873 p_valid_i
= Signal(reset_less
=True)
874 pvr
= Signal(reset_less
=True)
875 m
.d
.comb
+= p_valid_i
.eq(self
.p
.valid_i_test
)
876 m
.d
.comb
+= pvr
.eq(p_valid_i
& self
.p
.ready_o
)
878 m
.d
.comb
+= self
.p
.ready_o
.eq(~self
.n
.valid_o | self
.n
.ready_i_test
)
879 m
.d
.sync
+= self
.n
.valid_o
.eq(p_valid_i | ~self
.p
.ready_o
)
881 odata
= Mux(pvr
, self
.data_r
, r_data
)
882 m
.d
.sync
+= nmoperator
.eq(r_data
, odata
)
883 r_data
= self
._postprocess
(r_data
) # XXX TBD, does nothing right now
884 m
.d
.comb
+= nmoperator
.eq(self
.n
.data_o
, r_data
)
889 class RegisterPipeline(UnbufferedPipeline
):
890 """ A pipeline stage that delays by one clock cycle, creating a
891 sync'd latch out of data_o and valid_o as an indirect byproduct
892 of using PassThroughStage
894 def __init__(self
, iospecfn
):
895 UnbufferedPipeline
.__init
__(self
, PassThroughStage(iospecfn
))
898 class FIFOControl(ControlBase
):
899 """ FIFO Control. Uses Queue to store data, coincidentally
900 happens to have same valid/ready signalling as Stage API.
902 data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
904 def __init__(self
, depth
, stage
, in_multi
=None, stage_ctl
=False,
905 fwft
=True, pipe
=False):
908 * :depth: number of entries in the FIFO
909 * :stage: data processing block
910 * :fwft: first word fall-thru mode (non-fwft introduces delay)
911 * :pipe: specifies pipe mode.
913 when fwft = True it indicates that transfers may occur
914 combinatorially through stage processing in the same clock cycle.
915 This requires that the Stage be a Moore FSM:
916 https://en.wikipedia.org/wiki/Moore_machine
918 when fwft = False it indicates that all output signals are
919 produced only from internal registers or memory, i.e. that the
920 Stage is a Mealy FSM:
921 https://en.wikipedia.org/wiki/Mealy_machine
923 data is processed (and located) as follows:
925 self.p self.stage temp fn temp fn temp fp self.n
926 data_i->process()->result->cat->din.FIFO.dout->cat(data_o)
928 yes, really: cat produces a Cat() which can be assigned to.
929 this is how the FIFO gets de-catted without needing a de-cat
935 ControlBase
.__init
__(self
, stage
, in_multi
, stage_ctl
)
937 def elaborate(self
, platform
):
938 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
940 # make a FIFO with a signal of equal width to the data_o.
941 (fwidth
, _
) = nmoperator
.shape(self
.n
.data_o
)
942 fifo
= Queue(fwidth
, self
.fdepth
, fwft
=self
.fwft
, pipe
=self
.pipe
)
943 m
.submodules
.fifo
= fifo
945 def processfn(data_i
):
946 # store result of processing in combinatorial temporary
947 result
= _spec(self
.stage
.ospec
, "r_temp")
948 m
.d
.comb
+= nmoperator
.eq(result
, self
.process(data_i
))
949 return nmoperator
.cat(result
)
951 ## prev: make the FIFO (Queue object) "look" like a PrevControl...
952 m
.submodules
.fp
= fp
= PrevControl()
953 fp
.valid_i
, fp
._ready
_o
, fp
.data_i
= fifo
.w_en
, fifo
.w_rdy
, fifo
.w_data
954 m
.d
.comb
+= fp
._connect
_in
(self
.p
, fn
=processfn
)
956 # next: make the FIFO (Queue object) "look" like a NextControl...
957 m
.submodules
.fn
= fn
= NextControl()
958 fn
.valid_o
, fn
.ready_i
, fn
.data_o
= fifo
.r_rdy
, fifo
.r_en
, fifo
.r_data
959 connections
= fn
._connect
_out
(self
.n
, fn
=nmoperator
.cat
)
960 valid_eq
, ready_eq
, data_o
= connections
962 # ok ok so we can't just do the ready/valid eqs straight:
963 # first 2 from connections are the ready/valid, 3rd is data.
965 m
.d
.comb
+= [valid_eq
, ready_eq
] # combinatorial on next ready/valid
967 m
.d
.sync
+= [valid_eq
, ready_eq
] # non-fwft mode needs sync
968 data_o
= self
._postprocess
(data_o
) # XXX TBD, does nothing right now
975 class UnbufferedPipeline(FIFOControl
):
976 def __init__(self
, stage
, in_multi
=None, stage_ctl
=False):
977 FIFOControl
.__init
__(self
, 1, stage
, in_multi
, stage_ctl
,
978 fwft
=True, pipe
=False)
980 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
981 class PassThroughHandshake(FIFOControl
):
982 def __init__(self
, stage
, in_multi
=None, stage_ctl
=False):
983 FIFOControl
.__init
__(self
, 1, stage
, in_multi
, stage_ctl
,
984 fwft
=True, pipe
=True)
986 # this is *probably* BufferedHandshake, although test #997 now succeeds.
987 class BufferedHandshake(FIFOControl
):
988 def __init__(self
, stage
, in_multi
=None, stage_ctl
=False):
989 FIFOControl
.__init
__(self
, 2, stage
, in_multi
, stage_ctl
,
990 fwft
=True, pipe
=False)
994 # this is *probably* SimpleHandshake (note: memory cell size=0)
995 class SimpleHandshake(FIFOControl):
996 def __init__(self, stage, in_multi=None, stage_ctl=False):
997 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
998 fwft=True, pipe=False)