""" Pipeline and BufferedHandshake implementation, conforming to the same API.
    For multi-input and multi-output variants, see multipipe.

    Associated development bugs:
    * http://bugs.libre-riscv.org/show_bug.cgi?id=64
    * http://bugs.libre-riscv.org/show_bug.cgi?id=57

    eq:

    a strategically very important function that is identical in function
    to nmigen's Signal.eq function, except it may take objects, or a list
    of objects, or a tuple of objects, and where objects may also be
    Records.

    Stage API:

    stage requires compliance with a strict API that may be
    implemented in several ways, including as a static class.
    the methods of a stage instance must be as follows:

    * ispec() - Input data format specification
                returns an object or a list or tuple of objects, or
                a Record, each object having an "eq" function which
                takes responsibility for copying by assignment
    * ospec() - Output data format specification
                requirements as for ispec
    * process(i) - Processes an ispec-formatted object
                returns a combinatorial block of a result that
                may be assigned to the output, by way of the "eq"
                function
    * setup(m, i) - Optional function for setting up submodules
                may be used for more complex stages, to link
                the input (i) to submodules.  must take responsibility
                for adding those submodules to the module (m).
                the submodules must be combinatorial blocks and
                must have their inputs and output linked combinatorially.
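
as a minimal sketch of the list above (plain python, no nmigen: integers
stand in for Signals, and the names AddOneStage and run_stage are
hypothetical, not part of this module), a conforming stage might look
like this:

```python
class AddOneStage:
    # hypothetical example stage: input and output are plain ints here.
    # in real use ispec/ospec would return Signals or Records instead.
    def ispec(self):
        return 0              # stand-in for an input Signal

    def ospec(self):
        return 0              # stand-in for an output Signal

    def process(self, i):
        return i + 1          # purely combinatorial: no state is kept


def run_stage(stage, value):
    # setup() is optional, so only call it if the stage provides it
    if hasattr(stage, "setup"):
        stage.setup(None, value)
    return stage.process(value)
```

note that nothing here derives from a base class: having the right
methods is all that is asked of a stage.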

    Both StageCls (for use with non-static classes) and Stage (for use
    by static classes) are abstract classes from which, for convenience
    and as a courtesy to other developers, anything conforming to the
    Stage API may *choose* to derive.

    StageChain:

    A useful combinatorial wrapper around stages that chains them together
    and then presents a Stage-API-conformant interface.  By presenting
    the same API as the stages it wraps, it can clearly be used recursively.

    RecordBasedStage:

    A convenience class that takes an input shape, output shape, a
    "processing" function and an optional "setup" function.  Honestly
    though, there's not much more effort to just... create a class
    that returns a couple of Records (see ExampleAddRecordStage).

    PassThroughStage:

    A convenience class that takes a single function as a parameter,
    that is chain-called to create the exact same input and output spec.
    It has a process() function that simply returns its input.

    Instances of this class are completely redundant if handed to
    StageChain, however when passed to UnbufferedPipeline they
    can be used to introduce a single clock delay.

    ControlBase:

    The base class for pipelines.  Contains previous and next ready/valid/data.
    Also has an extremely useful "connect" function that can be used to
    connect a chain of pipelines and present the exact same prev/next
    ready/valid/data interface.

    UnbufferedPipeline:

    A simple stalling clock-synchronised pipeline that has no buffering
    (unlike BufferedHandshake).  Data flows on *every* clock cycle when
    the conditions are right (this is nominally when the input is valid
    and the output is ready).

    A stall anywhere along the line will result in a stall back-propagating
    down the entire chain.  The BufferedHandshake by contrast will buffer
    incoming data, allowing previous stages one clock cycle's grace before
    they too must stall.

    An advantage of the UnbufferedPipeline over the Buffered one is
    that the amount of logic needed (number of gates) is greatly
    reduced (no second set of buffers, basically).

    The disadvantage of the UnbufferedPipeline is that the valid/ready
    logic, if chained together, is *combinatorial*, resulting in
    progressively larger gate delay.

    PassThroughHandshake:

    A Control class that introduces a single clock delay, passing its
    data through unaltered.  Unlike RegisterPipeline (which relies
    on UnbufferedPipeline and PassThroughStage) it handles ready/valid
    itself.

    RegisterPipeline:

    A convenience class: because UnbufferedPipeline introduces a single
    clock delay, giving it a PassThroughStage results in a Pipeline
    stage that, duh, delays its (unmodified) input by one clock cycle.

    BufferedHandshake:

    nmigen implementation of buffered pipeline stage, based on zipcpu:
    https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

    this module requires quite a bit of thought to understand how it works
    (and why it is needed in the first place).  reading the above is
    *strongly* recommended.

    unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
    the STB / ACK signals to raise and lower (on separate clocks) before
    data may proceed (thus only allowing one piece of data to proceed
    on *ALTERNATE* cycles), the signalling here is a true pipeline
    where data will flow on *every* clock when the conditions are right.

    input acceptance conditions are when:
        * incoming previous-stage strobe (p.valid_i) is HIGH
        * outgoing previous-stage ready  (p.ready_o) is HIGH

    output transmission conditions are when:
        * outgoing next-stage strobe (n.valid_o) is HIGH
        * incoming next-stage ready  (n.ready_i) is HIGH

    the tricky bit is when the input has valid data and the output is not
    ready to accept it.  if it wasn't for the clock synchronisation, it
    would be possible to tell the input "hey don't send that data, we're
    not ready".  unfortunately, it's not possible to "change the past":
    the previous stage *has no choice* but to pass on its data.

    therefore, the incoming data *must* be accepted - and stored: that
    is the responsibility / contract that this stage *must* accept.
    on the same clock, it's possible to tell the input that it must
    not send any more data.  this is the "stall" condition.

    we now effectively have *two* possible pieces of data to "choose" from:
    the buffered data, and the incoming data.  the decision as to which
    to process and output is based on whether we are in "stall" or not.
    i.e. when the next stage is ready again, the output comes from
    the buffer if a stall had previously occurred, otherwise it comes
    direct from processing the input.

    this allows us to respect a synchronous "travelling STB" with what
    dan calls a "buffered handshake".

    it's quite a complex state machine!
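
the stall behaviour described above can be sketched as a cycle-by-cycle
model in plain python.  this is a simplified illustration written for
this description, not a transcription of the nmigen logic further down:
the class name SkidModel, the run() driver and the choice to start with
ready_o high are all assumptions.

```python
import random


class SkidModel:
    # hypothetical cycle-level model of a buffered (skid) handshake stage
    def __init__(self, process):
        self.process = process
        self.ready_o = True    # registered "ready" shown to the previous stage
        self.valid_o = False   # registered "valid" shown to the next stage
        self.data_o = None     # output register
        self.r_data = None     # spare buffer, only used during a stall

    def step(self, valid_i, data_i, ready_i):
        # decisions use the values registered on the previous clock edge
        accepted = valid_i and self.ready_o
        emitted = [self.data_o] if (self.valid_o and ready_i) else []
        out_free = ready_i or not self.valid_o

        if not self.ready_o:
            # stalled: r_data holds one item, data_o holds an older one
            if ready_i:
                self.data_o = self.r_data      # flush the buffer
                self.ready_o = True            # stall released
        elif out_free:
            # normal flow: processed input goes straight to the output
            self.valid_o = accepted
            if accepted:
                self.data_o = self.process(data_i)
        elif accepted:
            # output is blocked but the input was already accepted:
            # store it, and tell the previous stage to stop sending
            self.r_data = self.process(data_i)
            self.ready_o = False
        return accepted, emitted


def run(items, seed=0, max_cycles=10000):
    # drive the model with pseudo-random valid/ready patterns
    rng = random.Random(seed)
    stage = SkidModel(lambda x: x + 1)
    pending, received = list(items), []
    for _ in range(max_cycles):
        if len(received) == len(items):
            break
        valid_i = bool(pending) and rng.random() < 0.7
        data_i = pending[0] if valid_i else None
        ready_i = rng.random() < 0.5
        accepted, emitted = stage.step(valid_i, data_i, ready_i)
        if accepted:
            pending.pop(0)
        received.extend(emitted)
    return received
```

however the random stalls fall, run() should hand back every item, once
and in order: nothing is dropped when the output blocks, because the one
item already in flight lands in r_data.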

    Synchronised pipeline, Based on:
    https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Cat, Const, Mux, Module, Value, Elaboratable
from nmigen.cli import verilog, rtlil
from nmigen.lib.fifo import SyncFIFO, SyncFIFOBuffered
from nmigen.hdl.ast import ArrayProxy
from nmigen.hdl.rec import Record, Layout

from abc import ABCMeta, abstractmethod
from collections.abc import Sequence, Iterable
from collections import OrderedDict
# NOTE: Queue is called below as Queue(width, depth, fwft=..., pipe=...),
# so this is assumed to be a project-local module, not the standard library
from queue import Queue
import inspect # used by _spec

from nmoperator import eq, cat, shape


class Object:
    def __init__(self):
        self.fields = OrderedDict()

    def __setattr__(self, k, v):
        # private names, reserved names, class attributes and anything set
        # before "fields" exists are stored as ordinary attributes
        if (k.startswith('_') or k in ["fields", "name", "src_loc"] or
           k in dir(Object) or "fields" not in self.__dict__):
            return object.__setattr__(self, k, v)
        self.fields[k] = v

    def __getattr__(self, k):
        if k in self.__dict__:
            return object.__getattribute__(self, k)
        try:
            return self.fields[k]
        except KeyError as e:
            raise AttributeError(e)

        for x in self.fields.values():
            if isinstance(x, Iterable):

        for (k, o) in self.fields.items():
            if isinstance(rres, Sequence):


class RecordObject(Record):
    def __init__(self, layout=None, name=None):
        Record.__init__(self, layout=layout or [], name=name)

    def __setattr__(self, k, v):
        if (k.startswith('_') or k in ["fields", "name", "src_loc"] or
           k in dir(Record) or "fields" not in self.__dict__):
            return object.__setattr__(self, k, v)
        self.fields[k] = v
        #print ("RecordObject setattr", k, v)
        # keep the Record layout in step with the dynamically-added field
        if isinstance(v, Record):
            newlayout = {k: (k, v.layout)}
        elif isinstance(v, Value):
            newlayout = {k: (k, v.shape())}
        else:
            newlayout = {k: (k, shape(v))}
        self.layout.fields.update(newlayout)

        for x in self.fields.values():
            if isinstance(x, Iterable):


def _spec(fn, name=None):
    # pass "name" through only if the spec function declares that parameter
    varnames = dict(inspect.getmembers(fn.__code__))['co_varnames']
    if 'name' in varnames:
        return fn(name=name)
    return fn()


class PrevControl(Elaboratable):
    """ contains signals that come *from* the previous stage (both in and out)
        * valid_i: previous stage indicating all incoming data is valid.
                   may be a multi-bit signal, where all bits are required
                   to be asserted to indicate "valid".
        * ready_o: output to previous stage indicating readiness to accept data
        * data_i : an input - added by the user of this class
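
the multi-bit valid_i rule can be illustrated with plain integers (a
sketch only: all_valid is a hypothetical helper, and an int stands in
for the Signal):

```python
def all_valid(valid_i, width):
    # valid only when every one of the "width" bits is set
    all1s = (1 << width) - 1
    return (valid_i & all1s) == all1s
```

so a 3-bit valid_i of 0b101 counts as "not valid", and only 0b111 does.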
    """

    def __init__(self, i_width=1, stage_ctl=False):
        self.stage_ctl = stage_ctl
        self.valid_i = Signal(i_width, name="p_valid_i") # prev   >>in  self
        self._ready_o = Signal(name="p_ready_o")         # prev   <<out self
        self.data_i = None # XXX MUST BE ADDED BY USER
        self.s_ready_o = Signal(name="p_s_o_rdy")        # prev   <<out self
        self.trigger = Signal(reset_less=True)

    @property
    def ready_o(self):
        """ public-facing API: indicates (externally) that stage is ready
        """
        if self.stage_ctl:
            return self.s_ready_o # set dynamically by stage
        return self._ready_o      # return this when not under dynamic control

    def _connect_in(self, prev, direct=False, fn=None):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        valid_i = prev.valid_i if direct else prev.valid_i_test
        data_i = fn(prev.data_i) if fn is not None else prev.data_i
        return [self.valid_i.eq(valid_i),
                prev.ready_o.eq(self.ready_o),
                eq(self.data_i, data_i),
               ]

    @property
    def valid_i_test(self):
        vlen = len(self.valid_i)
        if vlen > 1:
            # multi-bit case: valid only when valid_i is all 1s
            all1s = Const(-1, (len(self.valid_i), False))
            valid_i = (self.valid_i == all1s)
        else:
            # single-bit valid_i case
            valid_i = self.valid_i

        # when stage indicates not ready, incoming data
        # must "appear" to be not ready too
        if self.stage_ctl:
            valid_i = valid_i & self.s_ready_o

        return valid_i

    def elaborate(self, platform):
        m = Module()
        m.d.comb += self.trigger.eq(self.valid_i_test & self.ready_o)
        return m

        return [self.data_i.eq(i.data_i),
                self.ready_o.eq(i.ready_o),
                self.valid_i.eq(i.valid_i)]

        if hasattr(self.data_i, "ports"):
            yield from self.data_i.ports()
        elif isinstance(self.data_i, Sequence):
            yield from self.data_i


class NextControl(Elaboratable):
    """ contains the signals that go *to* the next stage (both in and out)
        * valid_o: output indicating to next stage that data is valid
        * ready_i: input from next stage indicating that it can accept data
        * data_o : an output - added by the user of this class
    """
    def __init__(self, stage_ctl=False):
        self.stage_ctl = stage_ctl
        self.valid_o = Signal(name="n_valid_o") # self out>>  next
        self.ready_i = Signal(name="n_ready_i") # self <<in   next
        self.data_o = None # XXX MUST BE ADDED BY USER
        self.d_valid = Signal(reset=1) # INTERNAL (data valid)
        self.trigger = Signal(reset_less=True)

    @property
    def ready_i_test(self):
        if self.stage_ctl:
            return self.ready_i & self.d_valid
        return self.ready_i

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
            data/valid is passed *TO* nxt, and ready comes *IN* from nxt.
            use this when connecting stage-to-stage
        """
        return [nxt.valid_i.eq(self.valid_o),
                self.ready_i.eq(nxt.ready_o),
                eq(nxt.data_i, self.data_o),
               ]

    def _connect_out(self, nxt, direct=False, fn=None):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        ready_i = nxt.ready_i if direct else nxt.ready_i_test
        data_o = fn(nxt.data_o) if fn is not None else nxt.data_o
        return [nxt.valid_o.eq(self.valid_o),
                self.ready_i.eq(ready_i),
                eq(data_o, self.data_o),
               ]

    def elaborate(self, platform):
        m = Module()
        m.d.comb += self.trigger.eq(self.ready_i_test & self.valid_o)
        return m

        if hasattr(self.data_o, "ports"):
            yield from self.data_o.ports()
        elif isinstance(self.data_o, Sequence):
            yield from self.data_o


class StageCls(metaclass=ABCMeta):
    """ Class-based "Stage" API.  requires instantiation (after derivation)

        see "Stage API" above.  Note: python does *not* require derivation
        from this class.  All that is required is that the pipelines *have*
        the functions listed in this class.  Derivation from this class
        is therefore merely a "courtesy" to maintainers.
    """
    @abstractmethod
    def ispec(self): pass       # REQUIRED
    @abstractmethod
    def ospec(self): pass       # REQUIRED
    #def setup(self, m, i): pass # OPTIONAL
    @abstractmethod
    def process(self, i): pass  # REQUIRED


class Stage(metaclass=ABCMeta):
    """ Static "Stage" API.  does not require instantiation (after derivation)

        see "Stage API" above.  Note: python does *not* require derivation
        from this class.  All that is required is that the pipelines *have*
        the functions listed in this class.  Derivation from this class
        is therefore merely a "courtesy" to maintainers.
    """
    #def setup(m, i): pass


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """
    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn
    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i): return self.__setup(m, i)


class StageChain(StageCls):
    """ pass in a list of stages, and they will automatically be
        chained together via their input and output specs into a
        single combinatorial chain.

        the end result basically conforms to the exact same Stage API.

        * input to this class will be the input of the first stage
        * output of first stage goes into input of second
        * output of second goes into input of third (etc. etc.)
        * the output of this class will be the output of the last stage
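
the chaining rules above amount to function composition.  a minimal
plain-python sketch (no nmigen; ChainModel and the two example stages
are hypothetical, and integers stand in for Signals):

```python
class AddOne:
    def ispec(self): return 0
    def ospec(self): return 0
    def process(self, i): return i + 1


class Double:
    def ispec(self): return 0
    def ospec(self): return 0
    def process(self, i): return i * 2


class ChainModel:
    # presents the same ispec/ospec/process API as the stages it wraps
    def __init__(self, chain):
        self.chain = chain

    def ispec(self):
        return self.chain[0].ispec()    # input of the first stage

    def ospec(self):
        return self.chain[-1].ospec()   # output of the last stage

    def process(self, i):
        for stage in self.chain:        # output of one feeds the next
            i = stage.process(i)
        return i
```

because ChainModel has the same three methods as its members, a chain
can itself be placed inside another chain.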
    """
    def __init__(self, chain, specallocate=False):
        self.chain = chain
        self.specallocate = specallocate

    def ispec(self):
        return _spec(self.chain[0].ispec, "chainin")

    def ospec(self):
        return _spec(self.chain[-1].ospec, "chainout")

    def _specallocate_setup(self, m, i):
        for (idx, c) in enumerate(self.chain):
            if hasattr(c, "setup"):
                c.setup(m, i)               # stage may have some module stuff
            ofn = self.chain[idx].ospec     # last assignment survives
            o = _spec(ofn, 'chainin%d' % idx)
            m.d.comb += eq(o, c.process(i)) # process input into "o"
            if idx == len(self.chain)-1:
                break                       # no next stage to feed
            ifn = self.chain[idx+1].ispec   # new input on next loop
            i = _spec(ifn, 'chainin%d' % (idx+1))
            m.d.comb += eq(i, o)            # assign to next input
        return o                            # last loop is the output

    def _noallocate_setup(self, m, i):
        for (idx, c) in enumerate(self.chain):
            if hasattr(c, "setup"):
                c.setup(m, i)               # stage may have some module stuff
            i = o = c.process(i)            # store input into "o"
        return o                            # last loop is the output

    def setup(self, m, i):
        if self.specallocate:
            self.o = self._specallocate_setup(m, i)
        else:
            self.o = self._noallocate_setup(m, i)

    def process(self, i):
        return self.o # conform to Stage API: return last-loop output


class ControlBase(Elaboratable):
    """ Common functions for Pipeline API
    """
    def __init__(self, stage=None, in_multi=None, stage_ctl=False):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add data_i member to PrevControl (p) and
            * add data_o member to NextControl (n)
        """
        self.stage = stage

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl)
        self.n = NextControl(stage_ctl)

        # set up the input and output data
        if stage is not None:
            self.p.data_i = _spec(stage.ispec, "data_i") # input type
            self.n.data_o = _spec(stage.ospec, "data_o") # output type

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out

                      [pipe1, pipe2, pipe3, pipe4]

                       out---in out--in out---in

            Also takes care of allocating data_i/data_o, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.data_i or self.n.data_o manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects.

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) are not handled
            here: see multipipe.
        """
        eqs = [] # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]
            pipe2 = pipechain[i+1]
            eqs += pipe1.connect_to_next(pipe2)

        # connect front of chain to ourselves
        front = pipechain[0]
        self.p.data_i = _spec(front.stage.ispec, "chainin")
        eqs += front._connect_in(self)

        # connect end of chain to ourselves
        end = pipechain[-1]
        self.n.data_o = _spec(end.stage.ospec, "chainout")
        eqs += end._connect_out(self)

        return eqs

    def _postprocess(self, i): # XXX DISABLED
        return i # RETURNS INPUT
        if hasattr(self.stage, "postprocess"):
            return self.stage.postprocess(i)
        return i

    def set_input(self, i):
        """ helper function to set the input data
        """
        return eq(self.p.data_i, i)

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        if self.stage is not None and hasattr(self.stage, "setup"):
            self.stage.setup(m, self.p.data_i)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.ready_i)
        m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted into a temporary register.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        input data p.data_i is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the stall condition is released, r_data is the first
        to be transferred to the output [flush_buffer()], and the stall
        condition is cleared.

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     o_n_validn.eq(~self.n.valid_o),
                     n_ready_i.eq(self.n.ready_i_test),
                     nir_por.eq(n_ready_i & self.p._ready_o),
                     nir_por_n.eq(n_ready_i & ~self.p._ready_o),
                     nir_novn.eq(n_ready_i | o_n_validn),
                     nirn_novn.eq(~n_ready_i & o_n_validn),
                     npnn.eq(nir_por | nirn_novn),
                     por_pivn.eq(self.p._ready_o & ~p_valid_i)
        ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += eq(result, self.stage.process(self.p.data_i))

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.ready_o): # not stalled
            self.m.d.sync += eq(r_data, result) # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            data_o = self._postprocess(result)
            self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
                              eq(self.n.data_o, data_o),    # update output
                             ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n): # next is ready, and we were stalled
            # Flush the [already processed] buffer to the output port.
            data_o = self._postprocess(r_data)
            self.m.d.sync += [self.n.valid_o.eq(1),      # reg empty
                              eq(self.n.data_o, data_o), # flush buffer
                             ]
        # output ready conditions
        self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P

        0 0 1 0   0    0     0 1   process(data_i)
        0 0 1 1   0    0     0 1   process(data_i)

        0 1 1 0   0    0     0 1   process(data_i)
        0 1 1 1   0    0     0 1   process(data_i)

        1 0 1 0   0    0     0 1   process(data_i)
        1 0 1 1   0    0     0 1   process(data_i)

        1 1 0 0   1    0     1 0   process(data_i)
        1 1 0 1   1    1     1 0   process(data_i)
        1 1 1 0   1    0     1 1   process(data_i)
        1 1 1 1   1    0     1 1   process(data_i)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal() # registered: is the output holding valid data
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        p_valid_i_p_ready_o = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     n_ready_i.eq(self.n.ready_i_test),
                     p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
        ]

        # store result of processing in combinatorial temporary
        m.d.comb += eq(result, self.stage.process(self.p.data_i))

        # previous valid and ready
        with m.If(p_valid_i_p_ready_o):
            data_o = self._postprocess(result)
            m.d.sync += [r_busy.eq(1),      # output valid
                         eq(self.n.data_o, data_o), # update output
                        ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_ready_i):
            data_o = self._postprocess(result)
            m.d.sync += [eq(self.n.data_o, data_o)]
            # TODO: could still send data here (if there was any)
            #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0) # ...so set output invalid

        m.d.comb += self.n.valid_o.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._ready_o.eq(n_ready_i)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        r_data : output_shape according to ospec
                 A temporary (buffered) copy of a prior (valid) processed
                 input.  This is HELD if the output is not ready.  It is
                 updated SYNCHRONOUSLY.
        result: output_shape according to ospec
                 The output of the combinatorial logic.  it is updated
                 COMBINATORIALLY (no clock dependence).

        Inputs   Temp  Output  Data

        1 1 0 0   0     1 1    process(data_i)
        1 1 0 1   1     1 0    process(data_i)
        1 1 1 0   0     1 1    process(data_i)
        1 1 1 1   0     1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
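
a plain-python sketch of the same behaviour, one call to step() per
clock.  the expressions for ready_o and data_valid follow the ones in
elaborate(); the class name UnbufferedModel is hypothetical, and it is
assumed here that r_data is only loaded when the input is accepted.

```python
class UnbufferedModel:
    def __init__(self, process):
        self.process = process
        self.data_valid = False   # registered: does r_data hold an item?
        self.r_data = None

    def ready_o(self, ready_i):
        # combinatorial: depends directly on the next stage's ready_i
        return (not self.data_valid) or ready_i

    def step(self, valid_i, data_i, ready_i):
        accepted = valid_i and self.ready_o(ready_i)
        emitted = [self.r_data] if (self.data_valid and ready_i) else []
        buf_full = self.data_valid and not ready_i
        if accepted:
            self.r_data = self.process(data_i)
        self.data_valid = valid_i or buf_full
        return accepted, emitted
```

the point to notice is ready_o(): it is a function of ready_i in the
same cycle, which is exactly why chaining many of these stages makes
one long combinatorial path.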
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal() # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        p_valid_i = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
        m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)

        m.d.comb += self.n.valid_o.eq(data_valid)
        m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
        m.d.sync += data_valid.eq(p_valid_i | buf_full)

        with m.If(pv): # only latch a result when the input is accepted
            m.d.sync += eq(r_data, self.stage.process(self.p.data_i))
        data_o = self._postprocess(r_data)
        m.d.comb += eq(self.n.data_o, data_o)

        return self.m


class UnbufferedPipeline2(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1

        p.data_i : StageInput, shaped according to ispec
        n.data_o : StageOutput, shaped according to ospec
        buf : output_shape according to ospec
              A temporary (buffered) copy of a valid output
              This is HELD if the output is not ready.  It is updated
              SYNCHRONOUSLY.

        Inputs   Temp  Output  Data
        P P N N  ~NiR&  N P    (buf_full)

        0 0 0 0   0     0 1    process(data_i)
        0 0 0 1   1     1 0    reg (odata, unchanged)
        0 0 1 0   0     0 1    process(data_i)
        0 0 1 1   0     0 1    process(data_i)

        0 1 0 0   0     0 1    process(data_i)
        0 1 0 1   1     1 0    reg (odata, unchanged)
        0 1 1 0   0     0 1    process(data_i)
        0 1 1 1   0     0 1    process(data_i)

        1 0 0 0   0     1 1    process(data_i)
        1 0 0 1   1     1 0    reg (odata, unchanged)
        1 0 1 0   0     1 1    process(data_i)
        1 0 1 1   0     1 1    process(data_i)

        1 1 0 0   0     1 1    process(data_i)
        1 1 0 1   1     1 0    reg (odata, unchanged)
        1 1 1 0   0     1 1    process(data_i)
        1 1 1 1   0     1 1    process(data_i)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        buf_full = Signal() # is data valid or not
        buf = _spec(self.stage.ospec, "r_tmp") # output type

        p_valid_i = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)

        m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
        m.d.comb += self.p._ready_o.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)

        data_o = Mux(buf_full, buf, self.stage.process(self.p.data_i))
        data_o = self._postprocess(data_o)
        m.d.comb += eq(self.n.data_o, data_o)
        m.d.sync += eq(buf, self.n.data_o)

        return self.m


class PassThroughStage(StageCls):
    """ a pass-through stage which has its input data spec equal to its output,
        and "passes through" its data from input to output.
    """
    def __init__(self, iospecfn):
        self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()
    def process(self, i): return i


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary          Output Data
        -------  ------------------ -----  ----
        P P N N  PiV& PiV| NiR| pvr  N P   (pvr)
        i o i o  PoR  ~PoR ~NoV      o o

        0 0 0 0   0    1    1    0   1 1   odata (unchanged)
        0 0 0 1   0    1    0    0   1 0   odata (unchanged)
        0 0 1 0   0    1    1    0   1 1   odata (unchanged)
        0 0 1 1   0    1    1    0   1 1   odata (unchanged)

        0 1 0 0   0    0    1    0   0 1   odata (unchanged)
        0 1 0 1   0    0    0    0   0 0   odata (unchanged)
        0 1 1 0   0    0    1    0   0 1   odata (unchanged)
        0 1 1 1   0    0    1    0   0 1   odata (unchanged)

        1 0 0 0   0    1    1    1   1 1   process(in)
        1 0 0 1   0    1    0    0   1 0   odata (unchanged)
        1 0 1 0   0    1    1    1   1 1   process(in)
        1 0 1 1   0    1    1    1   1 1   process(in)

        1 1 0 0   1    1    1    1   1 1   process(in)
        1 1 0 1   1    1    0    0   1 0   odata (unchanged)
        1 1 1 0   1    1    1    1   1 1   process(in)
        1 1 1 1   1    1    1    1   1 1   process(in)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        p_valid_i = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)

        m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
        m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)

        odata = Mux(pvr, self.stage.process(self.p.data_i), r_data)
        m.d.sync += eq(r_data, odata)
        r_data = self._postprocess(r_data)
        m.d.comb += eq(self.n.data_o, r_data)

        return m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of data_o and valid_o as an indirect byproduct
        of using PassThroughStage
    """
    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses SyncFIFO to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
    """

    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                       fwft=True, buffered=False, pipe=False):
        """
            * depth: number of entries in the FIFO
            * stage: data processing block
            * fwft : first word fall-thru mode (non-fwft introduces delay)
            * buffered: use buffered FIFO (introduces extra cycle delay)

            NOTE 1: FPGAs may have trouble with the defaults for SyncFIFO
                    (fwft=True, buffered=False)

            NOTE 2: data_i *must* have a shape function.  it can therefore
                    be a Signal, or a Record, or a RecordObject.

            data is processed (and located) as follows:

            self.p  self.stage  temp    fn temp  fn  temp  fp   self.n
            data_i->process()->result->cat->din.FIFO.dout->cat(data_o)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a de-cat
            function
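
what cat does for the FIFO can be pictured with plain integers: several
fields are packed into one wide word on the way in and split by width on
the way out.  this is only an illustration (pack and unpack are
hypothetical helpers; in the real code the split happens by assigning to
a Cat(), not by calling a function):

```python
def pack(values, widths):
    # first field ends up in the least significant bits
    word, shift = 0, 0
    for value, width in zip(values, widths):
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word


def unpack(word, widths):
    # peel fields off again, lowest bits first
    values = []
    for width in widths:
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values
```

the FIFO itself only ever sees the single wide word, which is why its
width is taken from the total shape of data_o.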
        """

        assert not (fwft and buffered), "buffered cannot do fwft"
        self.fwft = fwft
        self.buffered = buffered
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the data_o.
        (fwidth, _) = shape(self.n.data_o)
        if self.buffered:
            fifo = SyncFIFOBuffered(fwidth, self.fdepth)
        else:
            fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        # store result of processing in combinatorial temporary
        result = _spec(self.stage.ospec, "r_temp")
        m.d.comb += eq(result, self.stage.process(self.p.data_i))

        # connect previous rdy/valid/data - do cat on data_i
        # NOTE: cannot do the PrevControl-looking trick because
        # of need to process the data.  shaaaame....
        m.d.comb += [fifo.we.eq(self.p.valid_i_test),
                     self.p.ready_o.eq(fifo.writable),
                     eq(fifo.din, cat(result)),
                   ]

        # connect next rdy/valid/data - do cat on data_o
        connections = [self.n.valid_o.eq(fifo.readable),
                       fifo.re.eq(self.n.ready_i_test),
                      ]
        if self.fwft or self.buffered:
            m.d.comb += connections
        else:
            m.d.sync += connections # unbuffered non-fwft mode needs sync
        data_o = cat(self.n.data_o).eq(fifo.dout)
        data_o = self._postprocess(data_o)
        m.d.comb += data_o

        return m


class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)

# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=True)

# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)


# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)