split stageapi into separate module, move ControlBase to singlepipe
1 """ Pipeline and BufferedHandshake implementation, conforming to the same API.
2 For multi-input and multi-output variants, see multipipe.
3
4 Associated development bugs:
5 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
6 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
7
8 Important: see Stage API (stageapi.py) in combination with below
9
10 RecordBasedStage:
11 ----------------
12
13 A convenience class that takes an input shape, output shape, a
14 "processing" function and an optional "setup" function. Honestly
15 though, there's not much more effort to just... create a class
16 that returns a couple of Records (see ExampleAddRecordStage in
17 examples).
18
19 PassThroughStage:
20 ----------------
21
22 A convenience class that takes a single function as a parameter,
23 that is chain-called to create the exact same input and output spec.
24 It has a process() function that simply returns its input.
25
26 Instances of this class are completely redundant if handed to
27 StageChain, however when passed to UnbufferedPipeline they
28 can be used to introduce a single clock delay.
29
30 ControlBase:
31 -----------
32
33 The base class for pipelines. Contains previous and next ready/valid/data.
34 Also has an extremely useful "connect" function that can be used to
35 connect a chain of pipelines and present the exact same prev/next
36 ready/valid/data API.
37
38 Note: pipelines basically do not become pipelines as such until
39 handed to a derivative of ControlBase. ControlBase itself is *not*
40 strictly considered a pipeline class. Wishbone and AXI4 (master or
41 slave) could be derived from ControlBase, for example.
42 UnbufferedPipeline:
43 ------------------
44
45 A simple stalling clock-synchronised pipeline that has no buffering
46 (unlike BufferedHandshake). Data flows on *every* clock cycle when
47 the conditions are right (this is nominally when the input is valid
48 and the output is ready).
49
50 A stall anywhere along the line will result in a stall back-propagating
51 down the entire chain. The BufferedHandshake by contrast will buffer
52 incoming data, allowing previous stages one clock cycle's grace before
53 also having to stall.
54
55 An advantage of the UnbufferedPipeline over the Buffered one is
56 that the amount of logic needed (number of gates) is greatly
57 reduced (no second set of buffers basically)
58
59 The disadvantage of the UnbufferedPipeline is that the valid/ready
60 logic, if chained together, is *combinatorial*, resulting in
61 progressively larger gate delay.
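
As a rough plain-Python sketch (not nmigen, and not part of this module),
the back-propagating stall can be modelled with the per-stage rule that
UnbufferedPipeline.elaborate uses, ready_o = ~data_valid | n.ready_i.
The name chain_ready and the list-based representation are assumptions
made purely for illustration:

```python
# Behavioural sketch only: models the combinatorial ready chain of
# several unbuffered stages, front of the pipeline first.
def chain_ready(data_valid, final_ready):
    # data_valid[k] is True when stage k currently holds valid data.
    # final_ready is the ready signal presented to the last stage.
    ready = [False] * len(data_valid)
    n_ready_i = final_ready
    # ready ripples backwards, from the last stage to the first,
    # all within one clock cycle (hence the growing gate delay).
    for k in reversed(range(len(data_valid))):
        ready[k] = (not data_valid[k]) or n_ready_i
        n_ready_i = ready[k]
    return ready

# every stage full and the output stalled: the whole chain stalls
print(chain_ready([True, True, True], False))   # [False, False, False]
# an empty stage (a "bubble") absorbs the stall for the stages before it
print(chain_ready([True, False, True], False))  # [True, True, False]
```

The loop body is a single OR per stage, so the depth of the ready logic
grows linearly with the number of chained stages.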
62
63 PassThroughHandshake:
64 ------------------
65
66 A Control class that introduces a single clock delay, passing its
67 data through unaltered. Unlike RegisterPipeline (which relies
68 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
69 itself.
70
71 RegisterPipeline:
72 ----------------
73
74 A convenience class: because UnbufferedPipeline introduces a single
75 clock delay, giving it a PassThroughStage as its stage results in a Pipeline
76 stage that, duh, delays its (unmodified) input by one clock cycle.
77
78 BufferedHandshake:
79 ----------------
80
81 nmigen implementation of buffered pipeline stage, based on zipcpu:
82 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
83
84 this module requires quite a bit of thought to understand how it works
85 (and why it is needed in the first place). reading the above is
86 *strongly* recommended.
87
88 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
89 the STB / ACK signals to raise and lower (on separate clocks) before
90 data may proceed (thus only allowing one piece of data to proceed
91 on *ALTERNATE* cycles), the signalling here is a true pipeline
92 where data will flow on *every* clock when the conditions are right.
93
94 input acceptance conditions are when:
95 * incoming previous-stage strobe (p.valid_i) is HIGH
96 * outgoing previous-stage ready (p.ready_o) is HIGH
97
98 output transmission conditions are when:
99 * outgoing next-stage strobe (n.valid_o) is HIGH
100 * incoming next-stage ready (n.ready_i) is HIGH
101
102 the tricky bit is when the input has valid data and the output is not
103 ready to accept it. if it wasn't for the clock synchronisation, it
104 would be possible to tell the input "hey don't send that data, we're
105 not ready". unfortunately, it's not possible to "change the past":
106 the previous stage *has no choice* but to pass on its data.
107
108 therefore, the incoming data *must* be accepted - and stored: that
109 is the responsibility / contract that this stage *must* accept.
110 on the same clock, it's possible to tell the input that it must
111 not send any more data. this is the "stall" condition.
112
113 we now effectively have *two* possible pieces of data to "choose" from:
114 the buffered data, and the incoming data. the decision as to which
115 to process and output is based on whether we are in "stall" or not.
116 i.e. when the next stage is no longer ready, the output comes from
117 the buffer if a stall had previously occurred, otherwise it comes
118 direct from processing the input.
119
120 this allows us to respect a synchronous "travelling STB" with what
121 dan calls a "buffered handshake".
122
123 it's quite a complex state machine!
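
The "one clock cycle's grace" behaviour can be illustrated with a small
plain-Python model. This is only a behavioural sketch under a simplifying
assumption (the stage is treated as a two-entry queue whose ready and
valid depend only on stored state); it is not the nmigen logic below, and
the class name BufferedModel is hypothetical:

```python
from collections import deque

# Behavioural sketch only: a two-entry buffered stage. ready_o and
# valid_o depend solely on stored state, so neither signal forms a
# combinatorial path through the stage.
class BufferedModel:
    def __init__(self):
        self.buf = deque()

    @property
    def ready_o(self):           # presented to the previous stage
        return len(self.buf) < 2

    @property
    def valid_o(self):           # presented to the next stage
        return len(self.buf) > 0

    def clock(self, valid_i, data_i, ready_i):
        # sample the handshake signals as they stand before the clock edge
        accept = valid_i and self.ready_o
        send = self.valid_o and ready_i
        sent = self.buf.popleft() if send else None
        if accept:
            self.buf.append(data_i)
        return accept, sent

m = BufferedModel()
# next stage stalled: two items are still accepted, the third is refused
print([m.clock(True, d, False) for d in (10, 11, 12)])
# next stage ready again: the stored items leave in order
print([m.clock(False, None, True)[1] for _ in range(2)])
```

The second accepted item is the grace cycle: the previous stage only
sees ready_o drop one clock after the next stage stopped accepting.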
124
125 SimpleHandshake
126 ---------------
127
128 Synchronised pipeline, Based on:
129 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
130 """
131
132 from nmigen import Signal, Cat, Const, Mux, Module, Value, Elaboratable
133 from nmigen.cli import verilog, rtlil
134 from nmigen.lib.fifo import SyncFIFO, SyncFIFOBuffered
135 from nmigen.hdl.ast import ArrayProxy
136 from nmigen.hdl.rec import Record
137
138 from abc import ABCMeta, abstractmethod
139 from collections.abc import Sequence, Iterable
140 from collections import OrderedDict
141 from queue import Queue # NOTE: used below with fwft/pipe args, so not the stdlib Queue
142 import inspect
143
144 import nmoperator
145 from iocontrol import (Object, RecordObject)
146 from stageapi import (_spec, PrevControl, NextControl, StageCls, Stage,
147 StageChain, StageHelper)
148
149
150
151 class ControlBase(StageHelper, Elaboratable):
152 """ Common functions for Pipeline API. Note: a "pipeline stage" only
153 exists (conceptually) when a ControlBase derivative is handed
154 a Stage (combinatorial block)
155
156 NOTE: ControlBase derives from StageHelper, making it accidentally
157 compliant with the Stage API. Using those functions directly
158 *BYPASSES* a ControlBase instance ready/valid signalling, which
159 clearly should not be done without a really, really good reason.
160 """
161 def __init__(self, stage=None, in_multi=None, stage_ctl=False):
162 """ Base class containing ready/valid/data to previous and next stages
163
164 * p: contains ready/valid to the previous stage
165 * n: contains ready/valid to the next stage
166
167 Except when calling ControlBase.connect(), user must also:
168 * add data_i member to PrevControl (p) and
169 * add data_o member to NextControl (n)
170 Calling ControlBase._new_data is a good way to do that.
171 """
172 StageHelper.__init__(self, stage)
173
174 # set up input and output IO ACK (prev/next ready/valid)
175 self.p = PrevControl(in_multi, stage_ctl)
176 self.n = NextControl(stage_ctl)
177
178 # set up the input and output data
179 if stage is not None:
180 self._new_data(self, self, "data")
181
182 def _new_data(self, p, n, name):
183 """ allocates new data_i and data_o
184 """
185 self.p.data_i = _spec(p.stage.ispec, "%s_i" % name)
186 self.n.data_o = _spec(n.stage.ospec, "%s_o" % name)
187
188 @property
189 def data_r(self):
190 return self.process(self.p.data_i)
191
192 def connect_to_next(self, nxt):
193 """ helper function to connect to the next stage data/valid/ready.
194 """
195 return self.n.connect_to_next(nxt.p)
196
197 def _connect_in(self, prev):
198 """ internal helper function to connect stage to an input source.
199 do not use to connect stage-to-stage!
200 """
201 return self.p._connect_in(prev.p)
202
203 def _connect_out(self, nxt):
204 """ internal helper function to connect stage to an output source.
205 do not use to connect stage-to-stage!
206 """
207 return self.n._connect_out(nxt.n)
208
209 def connect(self, pipechain):
210 """ connects a chain (list) of Pipeline instances together and
211 links them to this ControlBase instance:
212
213 in <----> self <---> out
214 | ^
215 v |
216 [pipe1, pipe2, pipe3, pipe4]
217 | ^ | ^ | ^
218 v | v | v |
219 out---in out--in out---in
220
221 Also takes care of allocating data_i/data_o, by looking up
222 the data spec for each end of the pipechain. i.e. it is NOT
223 necessary to allocate self.p.data_i or self.n.data_o manually:
224 this is handled AUTOMATICALLY, here.
225
226 Basically this function is the direct equivalent of StageChain,
227 except that unlike StageChain, the Pipeline logic is followed.
228
229 Just as StageChain presents an object that conforms to the
230 Stage API from a list of objects that also conform to the
231 Stage API, an object that calls this Pipeline connect function
232 has the exact same pipeline API as the list of pipeline objects
233 it is called with.
234
235 Thus it becomes possible to build up larger chains recursively.
236 More complex chains (multi-input, multi-output) will have to be
237 done manually.
238
239 Argument:
240
241 * :pipechain: - a sequence of ControlBase-derived classes
242 (must be one or more in length)
243
244 Returns:
245
246 * a list of eq assignments that will need to be added in
247 an elaborate() to m.d.comb
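
The wiring order can be sketched in plain Python. This is an illustrative
model only: connect_plan is a hypothetical helper that lists the
connections as strings (arrows follow the data/valid direction; ready
travels the opposite way) and does not create any nmigen statements:

```python
# Sketch only: lists, as strings, the connections that connect() makes.
def connect_plan(names):
    assert len(names) > 0, "pipechain must be non-zero length"
    plan = []
    # inter-chain: each pipe's n side feeds the p side of the next pipe
    for earlier, later in zip(names, names[1:]):
        plan.append("%s.n -> %s.p" % (earlier, later))
    # front and back of the chain are tied to the enclosing instance
    plan.append("self.p -> %s.p" % names[0])
    plan.append("%s.n -> self.n" % names[-1])
    return plan

for line in connect_plan(["pipe1", "pipe2", "pipe3"]):
    print(line)
```

A chain of length one produces only the two front/back connections,
which is why a single-element pipechain is still valid.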
248 """
249 assert len(pipechain) > 0, "pipechain must be non-zero length"
250 eqs = [] # collated list of assignment statements
251
252 # connect inter-chain
253 for i in range(len(pipechain)-1):
254 pipe1 = pipechain[i] # earlier
255 pipe2 = pipechain[i+1] # later (by 1)
256 eqs += pipe1.connect_to_next(pipe2) # earlier n to later p
257
258 # connect front and back of chain to ourselves
259 front = pipechain[0] # first in chain
260 end = pipechain[-1] # last in chain
261 self._new_data(front, end, "chain") # NOTE: REPLACES existing data
262 eqs += front._connect_in(self) # front p to our p
263 eqs += end._connect_out(self) # end n to our n
264
265 return eqs
266
267 def set_input(self, i):
268 """ helper function to set the input data (used in unit tests)
269 """
270 return nmoperator.eq(self.p.data_i, i)
271
272 def __iter__(self):
273 yield from self.p # yields ready/valid/data (data also gets yielded)
274 yield from self.n # ditto
275
276 def ports(self):
277 return list(self)
278
279 def elaborate(self, platform):
280 """ handles case where stage has dynamic ready/valid functions
281 """
282 m = Module()
283 m.submodules.p = self.p
284 m.submodules.n = self.n
285
286 self.setup(m, self.p.data_i)
287
288 if not self.p.stage_ctl:
289 return m
290
291 # intercept the previous (outgoing) "ready", combine with stage ready
292 m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)
293
294 # intercept the next (incoming) "ready" and combine it with data valid
295 sdv = self.stage.d_valid(self.n.ready_i)
296 m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)
297
298 return m
299
300 class RecordBasedStage(Stage):
301 """ convenience class which provides a Records-based layout.
302 honestly it's a lot easier just to create a direct Records-based
303 class (see ExampleAddRecordStage)
304 """
305 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
306 self.in_shape = in_shape
307 self.out_shape = out_shape
308 self.__process = processfn
309 self.__setup = setupfn
310 def ispec(self): return Record(self.in_shape)
311 def ospec(self): return Record(self.out_shape)
312 def process(self, i): return self.__process(i)
313 def setup(self, m, i): return self.__setup(m, i)
314
315
316 class BufferedHandshake(ControlBase):
317 """ buffered pipeline stage. data and strobe signals travel in sync.
318 if ever the input is ready and the output is not, processed data
319 is shunted into a temporary register.
320
321 Argument: stage. see Stage API (stageapi.py)
322
323 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
324 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
325 stage-1 p.data_i >>in stage n.data_o out>> stage+1
326 | |
327 process --->----^
328 | |
329 +-- r_data ->-+
330
331 input data p.data_i is read (only), is processed and goes into an
332 intermediate result store [process()]. this is updated combinatorially.
333
334 in a non-stall condition, the intermediate result will go into the
335 output (update_output). however if ever there is a stall, it goes
336 into r_data instead [update_buffer()].
337
338 when the non-stall condition is released, r_data is the first
339 to be transferred to the output [flush_buffer()], and the stall
340 condition cleared.
341
342 on the next cycle (as long as stall is not raised again) the
343 input may begin to be processed and transferred directly to output.
344 """
345
346 def elaborate(self, platform):
347 self.m = ControlBase.elaborate(self, platform)
348
349 result = _spec(self.stage.ospec, "r_tmp")
350 r_data = _spec(self.stage.ospec, "r_data")
351
352 # establish some combinatorial temporaries
353 o_n_validn = Signal(reset_less=True)
354 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
355 nir_por = Signal(reset_less=True)
356 nir_por_n = Signal(reset_less=True)
357 p_valid_i = Signal(reset_less=True)
358 nir_novn = Signal(reset_less=True)
359 nirn_novn = Signal(reset_less=True)
360 por_pivn = Signal(reset_less=True)
361 npnn = Signal(reset_less=True)
362 self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
363 o_n_validn.eq(~self.n.valid_o),
364 n_ready_i.eq(self.n.ready_i_test),
365 nir_por.eq(n_ready_i & self.p._ready_o),
366 nir_por_n.eq(n_ready_i & ~self.p._ready_o),
367 nir_novn.eq(n_ready_i | o_n_validn),
368 nirn_novn.eq(~n_ready_i & o_n_validn),
369 npnn.eq(nir_por | nirn_novn),
370 por_pivn.eq(self.p._ready_o & ~p_valid_i)
371 ]
372
373 # store result of processing in combinatorial temporary
374 self.m.d.comb += nmoperator.eq(result, self.data_r)
375
376 # if not in stall condition, update the temporary register
377 with self.m.If(self.p.ready_o): # not stalled
378 self.m.d.sync += nmoperator.eq(r_data, result) # update buffer
379
380 # data pass-through conditions
381 with self.m.If(npnn):
382 data_o = self._postprocess(result) # XXX TBD, does nothing right now
383 self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
384 nmoperator.eq(self.n.data_o, data_o), # update out
385 ]
386 # buffer flush conditions (NOTE: can override data passthru conditions)
387 with self.m.If(nir_por_n): # next is ready, but previous was stalled
388 # Flush the [already processed] buffer to the output port.
389 data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
390 self.m.d.sync += [self.n.valid_o.eq(1), # reg empty
391 nmoperator.eq(self.n.data_o, data_o), # flush
392 ]
393 # output ready conditions
394 self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)
395
396 return self.m
397
398
399 class SimpleHandshake(ControlBase):
400 """ simple handshake control. data and strobe signals travel in sync.
401 implements the protocol used by Wishbone and AXI4.
402
403 Argument: stage. see Stage API (stageapi.py)
404
405 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
406 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
407 stage-1 p.data_i >>in stage n.data_o out>> stage+1
408 | |
409 +--process->--^
410 Truth Table
411
412 Inputs Temporary Output Data
413 ------- ---------- ----- ----
414 P P N N PiV& ~NiR& N P
415 i o i o PoR NoV o o
416 V R R V V R
417
418 ------- - - - -
419 0 0 0 0 0 0 >0 0 reg
420 0 0 0 1 0 1 >1 0 reg
421 0 0 1 0 0 0 0 1 process(data_i)
422 0 0 1 1 0 0 0 1 process(data_i)
423 ------- - - - -
424 0 1 0 0 0 0 >0 0 reg
425 0 1 0 1 0 1 >1 0 reg
426 0 1 1 0 0 0 0 1 process(data_i)
427 0 1 1 1 0 0 0 1 process(data_i)
428 ------- - - - -
429 1 0 0 0 0 0 >0 0 reg
430 1 0 0 1 0 1 >1 0 reg
431 1 0 1 0 0 0 0 1 process(data_i)
432 1 0 1 1 0 0 0 1 process(data_i)
433 ------- - - - -
434 1 1 0 0 1 0 1 0 process(data_i)
435 1 1 0 1 1 1 1 0 process(data_i)
436 1 1 1 0 1 0 1 1 process(data_i)
437 1 1 1 1 1 0 1 1 process(data_i)
438 ------- - - - -
439 """
440
441 def elaborate(self, platform):
442 self.m = m = ControlBase.elaborate(self, platform)
443
444 r_busy = Signal()
445 result = _spec(self.stage.ospec, "r_tmp")
446
447 # establish some combinatorial temporaries
448 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
449 p_valid_i_p_ready_o = Signal(reset_less=True)
450 p_valid_i = Signal(reset_less=True)
451 m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
452 n_ready_i.eq(self.n.ready_i_test),
453 p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
454 ]
455
456 # store result of processing in combinatorial temporary
457 m.d.comb += nmoperator.eq(result, self.data_r)
458
459 # previous valid and ready
460 with m.If(p_valid_i_p_ready_o):
461 data_o = self._postprocess(result) # XXX TBD, does nothing right now
462 m.d.sync += [r_busy.eq(1), # output valid
463 nmoperator.eq(self.n.data_o, data_o), # update output
464 ]
465 # previous invalid or not ready, however next is accepting
466 with m.Elif(n_ready_i):
467 data_o = self._postprocess(result) # XXX TBD, does nothing right now
468 m.d.sync += [nmoperator.eq(self.n.data_o, data_o)]
469 # TODO: could still send data here (if there was any)
470 #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
471 m.d.sync += r_busy.eq(0) # ...so set output invalid
472
473 m.d.comb += self.n.valid_o.eq(r_busy)
474 # if next is ready, so is previous
475 m.d.comb += self.p._ready_o.eq(n_ready_i)
476
477 return self.m
478
479
480 class UnbufferedPipeline(ControlBase):
481 """ A simple pipeline stage with single-clock synchronisation
482 and two-way valid/ready synchronised signalling.
483
484 Note that a stall in one stage will result in the entire pipeline
485 chain stalling.
486
487 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
488 travel synchronously with the data: the valid/ready signalling
489 combines in a *combinatorial* fashion. Therefore, a long pipeline
490 chain will lengthen propagation delays.
491
492 Argument: stage. see Stage API (stageapi.py)
493
494 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
495 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
496 stage-1 p.data_i >>in stage n.data_o out>> stage+1
497 | |
498 r_data result
499 | |
500 +--process ->-+
501
502 Attributes:
503 -----------
504 p.data_i : StageInput, shaped according to ispec
505 The pipeline input
506 n.data_o : StageOutput, shaped according to ospec
507 The pipeline output
508 r_data : output_shape according to ospec
509 A temporary (buffered) copy of the processed prior (valid) input.
510 This is HELD if the output is not ready. It is updated
511 SYNCHRONOUSLY.
512 result: output_shape according to ospec
513 The output of the combinatorial logic. it is updated
514 COMBINATORIALLY (no clock dependence).
515
516 Truth Table
517
518 Inputs Temp Output Data
519 ------- - ----- ----
520 P P N N ~NiR& N P
521 i o i o NoV o o
522 V R R V V R
523
524 ------- - - -
525 0 0 0 0 0 0 1 reg
526 0 0 0 1 1 1 0 reg
527 0 0 1 0 0 0 1 reg
528 0 0 1 1 0 0 1 reg
529 ------- - - -
530 0 1 0 0 0 0 1 reg
531 0 1 0 1 1 1 0 reg
532 0 1 1 0 0 0 1 reg
533 0 1 1 1 0 0 1 reg
534 ------- - - -
535 1 0 0 0 0 1 1 reg
536 1 0 0 1 1 1 0 reg
537 1 0 1 0 0 1 1 reg
538 1 0 1 1 0 1 1 reg
539 ------- - - -
540 1 1 0 0 0 1 1 process(data_i)
541 1 1 0 1 1 1 0 process(data_i)
542 1 1 1 0 0 1 1 process(data_i)
543 1 1 1 1 0 1 1 process(data_i)
544 ------- - - -
545
546 Note: PoR is *NOT* involved in the above decision-making.
547 """
548
549 def elaborate(self, platform):
550 self.m = m = ControlBase.elaborate(self, platform)
551
552 data_valid = Signal() # is data valid or not
553 r_data = _spec(self.stage.ospec, "r_tmp") # output type
554
555 # some temporaries
556 p_valid_i = Signal(reset_less=True)
557 pv = Signal(reset_less=True)
558 buf_full = Signal(reset_less=True)
559 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
560 m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
561 m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)
562
563 m.d.comb += self.n.valid_o.eq(data_valid)
564 m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
565 m.d.sync += data_valid.eq(p_valid_i | buf_full)
566
567 with m.If(pv):
568 m.d.sync += nmoperator.eq(r_data, self.data_r)
569 data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
570 m.d.comb += nmoperator.eq(self.n.data_o, data_o)
571
572 return self.m
573
574 class UnbufferedPipeline2(ControlBase):
575 """ A simple pipeline stage with single-clock synchronisation
576 and two-way valid/ready synchronised signalling.
577
578 Note that a stall in one stage will result in the entire pipeline
579 chain stalling.
580
581 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
582 travel synchronously with the data: the valid/ready signalling
583 combines in a *combinatorial* fashion. Therefore, a long pipeline
584 chain will lengthen propagation delays.
585
586 Argument: stage. see Stage API (stageapi.py)
587
588 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
589 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
590 stage-1 p.data_i >>in stage n.data_o out>> stage+1
591 | | |
592 +- process-> buf <-+
593 Attributes:
594 -----------
595 p.data_i : StageInput, shaped according to ispec
596 The pipeline input
597 n.data_o : StageOutput, shaped according to ospec
598 The pipeline output
599 buf : output_shape according to ospec
600 A temporary (buffered) copy of a valid output
601 This is HELD if the output is not ready. It is updated
602 SYNCHRONOUSLY.
603
604 Inputs Temp Output Data
605 ------- - -----
606 P P N N ~NiR& N P (buf_full)
607 i o i o NoV o o
608 V R R V V R
609
610 ------- - - -
611 0 0 0 0 0 0 1 process(data_i)
612 0 0 0 1 1 1 0 reg (odata, unchanged)
613 0 0 1 0 0 0 1 process(data_i)
614 0 0 1 1 0 0 1 process(data_i)
615 ------- - - -
616 0 1 0 0 0 0 1 process(data_i)
617 0 1 0 1 1 1 0 reg (odata, unchanged)
618 0 1 1 0 0 0 1 process(data_i)
619 0 1 1 1 0 0 1 process(data_i)
620 ------- - - -
621 1 0 0 0 0 1 1 process(data_i)
622 1 0 0 1 1 1 0 reg (odata, unchanged)
623 1 0 1 0 0 1 1 process(data_i)
624 1 0 1 1 0 1 1 process(data_i)
625 ------- - - -
626 1 1 0 0 0 1 1 process(data_i)
627 1 1 0 1 1 1 0 reg (odata, unchanged)
628 1 1 1 0 0 1 1 process(data_i)
629 1 1 1 1 0 1 1 process(data_i)
630 ------- - - -
631
632 Note: PoR is *NOT* involved in the above decision-making.
633 """
634
635 def elaborate(self, platform):
636 self.m = m = ControlBase.elaborate(self, platform)
637
638 buf_full = Signal() # is the buffer full or not
639 buf = _spec(self.stage.ospec, "r_tmp") # output type
640
641 # some temporaries
642 p_valid_i = Signal(reset_less=True)
643 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
644
645 m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
646 m.d.comb += self.p._ready_o.eq(~buf_full)
647 m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)
648
649 data_o = Mux(buf_full, buf, self.data_r)
650 data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
651 m.d.comb += nmoperator.eq(self.n.data_o, data_o)
652 m.d.sync += nmoperator.eq(buf, self.n.data_o)
653
654 return self.m
655
656
657 class PassThroughStage(StageCls):
658 """ a pass-through stage with its input data spec identical to its output,
659 and "passes through" its data from input to output (does nothing).
660
661 use this basically to explicitly make any data spec Stage-compliant.
662 (many APIs would potentially use a static "wrap" method in e.g.
663 StageCls to achieve a similar effect)
664 """
665 def __init__(self, iospecfn): self.iospecfn = iospecfn
666 def ispec(self): return self.iospecfn()
667 def ospec(self): return self.iospecfn()
668
669
670 class PassThroughHandshake(ControlBase):
671 """ A control block that delays by one clock cycle.
672
673 Inputs Temporary Output Data
674 ------- ------------------ ----- ----
675 P P N N PiV& PiV| NiR| pvr N P (pvr)
676 i o i o PoR ~PoR ~NoV o o
677 V R R V V R
678
679 ------- - - - - - -
680 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
681 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
682 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
683 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
684 ------- - - - - - -
685 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
686 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
687 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
688 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
689 ------- - - - - - -
690 1 0 0 0 0 1 1 1 1 1 process(in)
691 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
692 1 0 1 0 0 1 1 1 1 1 process(in)
693 1 0 1 1 0 1 1 1 1 1 process(in)
694 ------- - - - - - -
695 1 1 0 0 1 1 1 1 1 1 process(in)
696 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
697 1 1 1 0 1 1 1 1 1 1 process(in)
698 1 1 1 1 1 1 1 1 1 1 process(in)
699 ------- - - - - - -
700
701 """
702
703 def elaborate(self, platform):
704 self.m = m = ControlBase.elaborate(self, platform)
705
706 r_data = _spec(self.stage.ospec, "r_tmp") # output type
707
708 # temporaries
709 p_valid_i = Signal(reset_less=True)
710 pvr = Signal(reset_less=True)
711 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
712 m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)
713
714 m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
715 m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)
716
717 odata = Mux(pvr, self.data_r, r_data)
718 m.d.sync += nmoperator.eq(r_data, odata)
719 r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
720 m.d.comb += nmoperator.eq(self.n.data_o, r_data)
721
722 return m
723
724
725 class RegisterPipeline(UnbufferedPipeline):
726 """ A pipeline stage that delays by one clock cycle, creating a
727 sync'd latch out of data_o and valid_o as an indirect byproduct
728 of using PassThroughStage
729 """
730 def __init__(self, iospecfn):
731 UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))
732
733
734 class FIFOControl(ControlBase):
735 """ FIFO Control. Uses SyncFIFO to store data, coincidentally
736 happens to have same valid/ready signalling as Stage API.
737
738 data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
739 """
740 def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
741 fwft=True, buffered=False, pipe=False):
742 """ FIFO Control
743
744 * :depth: number of entries in the FIFO
745 * :stage: data processing block
746 * :fwft: first word fall-thru mode (non-fwft introduces delay)
747 * :buffered: use buffered FIFO (introduces extra cycle delay)
748
749 NOTE 1: FPGAs may have trouble with the defaults for SyncFIFO
750 (fwft=True, buffered=False). XXX TODO: fix this by
751 using Queue in all cases instead.
752
753 data is processed (and located) as follows:
754
755 self.p self.stage temp fn temp fn temp fp self.n
756 data_i->process()->result->cat->din.FIFO.dout->cat(data_o)
757
758 yes, really: cat produces a Cat() which can be assigned to.
759 this is how the FIFO gets de-catted without needing a de-cat
760 function
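
The cat / de-cat idea can be shown with plain integers. This is a sketch
under assumptions: pack and unpack are hypothetical helpers, not part of
nmoperator, and they only mimic the bit layout (as with nmigen's Cat(),
the first field occupies the least significant bits):

```python
# Sketch only: flatten (value, width) fields into one word and back.
def pack(fields):
    word, shift = 0, 0
    for value, width in fields:
        # mask each value to its width, then place it above earlier fields
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word, shift          # packed word and its total width

def unpack(word, widths):
    values = []
    for width in widths:
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values

word, total = pack([(0x3, 4), (0xAB, 8)])
print(hex(word), total)         # 0xab3 12
print(unpack(word, [4, 8]))     # [3, 171]
```

The FIFO only ever sees the single wide word; the field widths are all
that is needed to recover the individual values on the far side.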
761 """
762
763 assert not (fwft and buffered), "buffered cannot do fwft"
764 if buffered:
765 depth += 1
766 self.fwft = fwft
767 self.buffered = buffered
768 self.pipe = pipe
769 self.fdepth = depth
770 ControlBase.__init__(self, stage, in_multi, stage_ctl)
771
772 def elaborate(self, platform):
773 self.m = m = ControlBase.elaborate(self, platform)
774
775 # make a FIFO with a signal of equal width to the data_o.
776 (fwidth, _) = nmoperator.shape(self.n.data_o)
777 if self.buffered:
778 fifo = SyncFIFOBuffered(fwidth, self.fdepth)
779 else:
780 fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
781 m.submodules.fifo = fifo
782
783 # store result of processing in combinatorial temporary
784 result = _spec(self.stage.ospec, "r_temp")
785 m.d.comb += nmoperator.eq(result, self.data_r)
786
787 # connect previous rdy/valid/data - do cat on data_i
788 # NOTE: cannot do the PrevControl-looking trick because
789 # of need to process the data. shaaaame....
790 m.d.comb += [fifo.we.eq(self.p.valid_i_test),
791 self.p.ready_o.eq(fifo.writable),
792 nmoperator.eq(fifo.din, nmoperator.cat(result)),
793 ]
794
795 # connect next rdy/valid/data - do cat on data_o (further below)
796 connections = [self.n.valid_o.eq(fifo.readable),
797 fifo.re.eq(self.n.ready_i_test),
798 ]
799 if self.fwft or self.buffered:
800 m.d.comb += connections # combinatorial on next ready/valid
801 else:
802 m.d.sync += connections # unbuffered non-fwft mode needs sync
803 data_o = nmoperator.cat(self.n.data_o).eq(fifo.dout)
804 data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
805 m.d.comb += data_o
806
807 return m
808
809
810 # aka "RegStage".
811 class UnbufferedPipeline(FIFOControl):
812 def __init__(self, stage, in_multi=None, stage_ctl=False):
813 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
814 fwft=True, pipe=False)
815
816 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
817 class PassThroughHandshake(FIFOControl):
818 def __init__(self, stage, in_multi=None, stage_ctl=False):
819 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
820 fwft=True, pipe=True)
821
822 # this is *probably* BufferedHandshake, although test #997 now succeeds.
823 class BufferedHandshake(FIFOControl):
824 def __init__(self, stage, in_multi=None, stage_ctl=False):
825 FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
826 fwft=True, pipe=False)
827
828
829 """
830 # this is *probably* SimpleHandshake (note: memory cell size=0)
831 class SimpleHandshake(FIFOControl):
832 def __init__(self, stage, in_multi=None, stage_ctl=False):
833 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
834 fwft=True, pipe=False)
835 """