replace flatten with iterator
[ieee754fpu.git] / src / add / singlepipe.py
1 """ Pipeline and BufferedHandshake implementation, conforming to the same API.
2 For multi-input and multi-output variants, see multipipe.
3
4 Associated development bugs:
5 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
6 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
7
8 eq:
9 --
10
11 a strategically very important function that is identical in function
12 to nmigen's Signal.eq function, except it may take objects, or a list
13 of objects, or a tuple of objects, and where objects may also be
14 Records.
15
16 Stage API:
17 ---------
18
19 stage requires compliance with a strict API that may be
20 implemented in several ways, including as a static class.
21 the methods of a stage instance must be as follows:
22
23 * ispec() - Input data format specification
24 returns an object or a list or tuple of objects, or
25 a Record, each object having an "eq" function which
26 takes responsibility for copying by assignment all
27 sub-objects
28 * ospec() - Output data format specification
29 requirements as for ispec
30 * process(m, i) - Processes an ispec-formatted object
31 returns a combinatorial block of a result that
32 may be assigned to the output, by way of the "eq"
33 function
34 * setup(m, i) - Optional function for setting up submodules
35 may be used for more complex stages, to link
36 the input (i) to submodules. must take responsibility
37 for adding those submodules to the module (m).
38 the submodules must be combinatorial blocks and
39 must have their inputs and output linked combinatorially.
40
41 Both StageCls (for use with non-static classes) and Stage (for use
42 by static classes) are abstract classes from which, for convenience
43 and as a courtesy to other developers, anything conforming to the
44 Stage API may *choose* to derive.
45
46 StageChain:
47 ----------
48
49 A useful combinatorial wrapper around stages that chains them together
50 and then presents a Stage-API-conformant interface. By presenting
51 the same API as the stages it wraps, it can clearly be used recursively.
52
53 RecordBasedStage:
54 ----------------
55
56 A convenience class that takes an input shape, output shape, a
57 "processing" function and an optional "setup" function. Honestly
58 though, there's not much more effort to just... create a class
59 that returns a couple of Records (see ExampleAddRecordStage in
60 examples).
61
62 PassThroughStage:
63 ----------------
64
65 A convenience class that takes a single function as a parameter,
66 that is chain-called to create the exact same input and output spec.
67 It has a process() function that simply returns its input.
68
69 Instances of this class are completely redundant if handed to
70 StageChain, however when passed to UnbufferedPipeline they
71 can be used to introduce a single clock delay.
72
73 ControlBase:
74 -----------
75
76 The base class for pipelines. Contains previous and next ready/valid/data.
77 Also has an extremely useful "connect" function that can be used to
78 connect a chain of pipelines and present the exact same prev/next
79 ready/valid/data API.
80
81 UnbufferedPipeline:
82 ------------------
83
84 A simple stalling clock-synchronised pipeline that has no buffering
85 (unlike BufferedHandshake). Data flows on *every* clock cycle when
86 the conditions are right (this is nominally when the input is valid
87 and the output is ready).
88
89 A stall anywhere along the line will result in a stall back-propagating
90 down the entire chain. The BufferedHandshake by contrast will buffer
91 incoming data, allowing previous stages one clock cycle's grace before
92 also having to stall.
93
94 An advantage of the UnbufferedPipeline over the Buffered one is
95 that the amount of logic needed (number of gates) is greatly
96 reduced (no second set of buffers basically)
97
98 The disadvantage of the UnbufferedPipeline is that the valid/ready
99 logic, if chained together, is *combinatorial*, resulting in
100 progressively larger gate delay.
101
102 PassThroughHandshake:
103 ------------------
104
105 A Control class that introduces a single clock delay, passing its
106 data through unaltered. Unlike RegisterPipeline (which relies
107 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
108 itself.
109
110 RegisterPipeline:
111 ----------------
112
113 A convenience class that, because UnbufferedPipeline introduces a single
114 clock delay, when its stage is a PassThroughStage, it results in a Pipeline
115 stage that, duh, delays its (unmodified) input by one clock cycle.
116
117 BufferedHandshake:
118 ----------------
119
120 nmigen implementation of buffered pipeline stage, based on zipcpu:
121 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
122
123 this module requires quite a bit of thought to understand how it works
124 (and why it is needed in the first place). reading the above is
125 *strongly* recommended.
126
127 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
128 the STB / ACK signals to raise and lower (on separate clocks) before
129 data may proceed (thus only allowing one piece of data to proceed
130 on *ALTERNATE* cycles), the signalling here is a true pipeline
131 where data will flow on *every* clock when the conditions are right.
132
133 input acceptance conditions are when:
134 * incoming previous-stage strobe (p.i_valid) is HIGH
135 * outgoing previous-stage ready (p.o_ready) is HIGH
136
137 output transmission conditions are when:
138 * outgoing next-stage strobe (n.o_valid) is HIGH
139 * incoming next-stage ready (n.i_ready) is HIGH
140
141 the tricky bit is when the input has valid data and the output is not
142 ready to accept it. if it wasn't for the clock synchronisation, it
143 would be possible to tell the input "hey don't send that data, we're
144 not ready". unfortunately, it's not possible to "change the past":
145 the previous stage *has no choice* but to pass on its data.
146
147 therefore, the incoming data *must* be accepted - and stored: that
148 is the responsibility / contract that this stage *must* accept.
149 on the same clock, it's possible to tell the input that it must
150 not send any more data. this is the "stall" condition.
151
152 we now effectively have *two* possible pieces of data to "choose" from:
153 the buffered data, and the incoming data. the decision as to which
154 to process and output is based on whether we are in "stall" or not.
155 i.e. when the next stage is no longer ready, the output comes from
156 the buffer if a stall had previously occurred, otherwise it comes
157 direct from processing the input.
158
159 this allows us to respect a synchronous "travelling STB" with what
160 dan calls a "buffered handshake".
161
162 it's quite a complex state machine!
163
164 SimpleHandshake
165 ---------------
166
167 Synchronised pipeline, based on:
168 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
169 """
170
171 from nmigen import Signal, Cat, Const, Mux, Module, Value
172 from nmigen.cli import verilog, rtlil
173 from nmigen.lib.fifo import SyncFIFO, SyncFIFOBuffered
174 from nmigen.hdl.ast import ArrayProxy
175 from nmigen.hdl.rec import Record, Layout
176
177 from abc import ABCMeta, abstractmethod
178 from collections.abc import Sequence
179 from queue import Queue
180
181
182 class RecordObject(Record):
183 def __init__(self, layout=None, name=None):
184 Record.__init__(self, layout=layout or [], name=name)
185
186 def __setattr__(self, k, v):
187 if k in dir(Record) or "fields" not in self.__dict__:
188 return object.__setattr__(self, k, v)
189 self.fields[k] = v
190 if isinstance(v, Record):
191 newlayout = {k: (k, v.layout)}
192 else:
193 newlayout = {k: (k, v.shape())}
194 self.layout.fields.update(newlayout)
195
196
197
198 class PrevControl:
199 """ contains signals that come *from* the previous stage (both in and out)
200 * i_valid: previous stage indicating all incoming data is valid.
201 may be a multi-bit signal, where all bits are required
202 to be asserted to indicate "valid".
203 * o_ready: output to previous stage indicating readiness to accept data
204 * i_data : an input - added by the user of this class
205 """
206
207 def __init__(self, i_width=1, stage_ctl=False):
208 self.stage_ctl = stage_ctl
209 self.i_valid = Signal(i_width, name="p_i_valid") # prev >>in self
210 self._o_ready = Signal(name="p_o_ready") # prev <<out self
211 self.i_data = None # XXX MUST BE ADDED BY USER
212 if stage_ctl:
213 self.s_o_ready = Signal(name="p_s_o_rdy") # prev <<out self
214
215 @property
216 def o_ready(self):
217 """ public-facing API: indicates (externally) that stage is ready
218 """
219 if self.stage_ctl:
220 return self.s_o_ready # set dynamically by stage
221 return self._o_ready # return this when not under dynamic control
222
223 def _connect_in(self, prev, direct=False, fn=None):
224 """ internal helper function to connect stage to an input source.
225 do not use to connect stage-to-stage!
226 """
227 i_valid = prev.i_valid if direct else prev.i_valid_test
228 i_data = fn(prev.i_data) if fn is not None else prev.i_data
229 return [self.i_valid.eq(i_valid),
230 prev.o_ready.eq(self.o_ready),
231 eq(self.i_data, i_data),
232 ]
233
234 @property
235 def i_valid_test(self):
236 vlen = len(self.i_valid)
237 if vlen > 1:
238 # multi-bit case: valid only when i_valid is all 1s
239 all1s = Const(-1, (len(self.i_valid), False))
240 i_valid = (self.i_valid == all1s)
241 else:
242 # single-bit i_valid case
243 i_valid = self.i_valid
244
245 # when stage indicates not ready, incoming data
246 # must "appear" to be not ready too
247 if self.stage_ctl:
248 i_valid = i_valid & self.s_o_ready
249
250 return i_valid
251
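The multi-bit "valid" test performed by PrevControl.i_valid_test can be modelled with plain integers. This is a minimal sketch that does not use nmigen: valid_test and its parameters are hypothetical names chosen for illustration, and ints stand in for Signals.

```python
def valid_test(i_valid, width=1, stage_ctl=False, s_o_ready=1):
    # plain-integer model of PrevControl.i_valid_test (hypothetical helper)
    all1s = (1 << width) - 1      # same bit pattern as Const(-1, (width, False))
    if width > 1:
        # multi-bit case: valid only when every bit of i_valid is set
        valid = int((i_valid & all1s) == all1s)
    else:
        # single-bit case: i_valid is used directly
        valid = i_valid & 1
    if stage_ctl:
        # a stage that reports "not ready" masks the incoming valid
        valid &= s_o_ready & 1
    return valid
```

With width=3, only the value 0b111 counts as valid; any cleared bit makes the input appear invalid.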
252
253 class NextControl:
254 """ contains the signals that go *to* the next stage (both in and out)
255 * o_valid: output indicating to next stage that data is valid
256 * i_ready: input from next stage indicating that it can accept data
257 * o_data : an output - added by the user of this class
258 """
259 def __init__(self, stage_ctl=False):
260 self.stage_ctl = stage_ctl
261 self.o_valid = Signal(name="n_o_valid") # self out>> next
262 self.i_ready = Signal(name="n_i_ready") # self <<in next
263 self.o_data = None # XXX MUST BE ADDED BY USER
264 #if self.stage_ctl:
265 self.d_valid = Signal(reset=1) # INTERNAL (data valid)
266
267 @property
268 def i_ready_test(self):
269 if self.stage_ctl:
270 return self.i_ready & self.d_valid
271 return self.i_ready
272
273 def connect_to_next(self, nxt):
274 """ helper function to connect to the next stage data/valid/ready.
275 data/valid is passed *TO* nxt, and ready comes *IN* from nxt.
276 use this when connecting stage-to-stage
277 """
278 return [nxt.i_valid.eq(self.o_valid),
279 self.i_ready.eq(nxt.o_ready),
280 eq(nxt.i_data, self.o_data),
281 ]
282
283 def _connect_out(self, nxt, direct=False, fn=None):
284 """ internal helper function to connect stage to an output source.
285 do not use to connect stage-to-stage!
286 """
287 i_ready = nxt.i_ready if direct else nxt.i_ready_test
288 o_data = fn(nxt.o_data) if fn is not None else nxt.o_data
289 return [nxt.o_valid.eq(self.o_valid),
290 self.i_ready.eq(i_ready),
291 eq(o_data, self.o_data),
292 ]
293
294
295 class Visitor:
296 """ a helper routine which identifies if it is being passed a list
297 (or tuple) of objects, or signals, or Records, and calls
298 a visitor function.
299
300 the visiting fn is called when an object is identified.
301
302 Record is a special (unusual, recursive) case, where the input may be
303 specified as a dictionary (which may contain further dictionaries,
304 recursively), where the field names of the dictionary must match
305 the Record's field spec. Alternatively, an object with the same
306 member names as the Record may be assigned: it does not have to
307 *be* a Record.
308
309 ArrayProxy is also special-cased, it's a bit messy: whilst ArrayProxy
310 has an eq function, the object being assigned to it (e.g. a python
311 object) might not. despite the *input* having an eq function,
312 that doesn't help us, because it's the *ArrayProxy* that's being
313 assigned to. so.... we cheat. use the ports() function of the
314 python object, enumerate them, find out the list of Signals that way,
315 and assign them.
316 """
317 def visit(self, o, i, act):
318 if isinstance(o, dict):
319 return self.dict_visit(o, i, act)
320
321 res = act.prepare()
322 if not isinstance(o, Sequence):
323 o, i = [o], [i]
324 for (ao, ai) in zip(o, i):
325 #print ("visit", fn, ao, ai)
326 if isinstance(ao, Record):
327 rres = self.record_visit(ao, ai, act)
328 elif isinstance(ao, ArrayProxy) and not isinstance(ai, Value):
329 rres = self.arrayproxy_visit(ao, ai, act)
330 else:
331 rres = act.fn(ao, ai)
332 res += rres
333 return res
334
335 def dict_visit(self, o, i, act):
336 res = act.prepare()
337 for (k, v) in o.items():
338 print ("d-eq", v, i[k])
339 res.append(act.fn(v, i[k]))
340 return res
341
342 def record_visit(self, ao, ai, act):
343 res = act.prepare()
344 for idx, (field_name, field_shape, _) in enumerate(ao.layout):
345 if isinstance(field_shape, Layout):
346 val = ai.fields
347 else:
348 val = ai
349 if hasattr(val, field_name): # check for attribute
350 val = getattr(val, field_name)
351 else:
352 val = val[field_name] # dictionary-style specification
353 val = self.visit(ao.fields[field_name], val, act)
354 if isinstance(val, Sequence):
355 res += val
356 else:
357 res.append(val)
358 return res
359
360 def arrayproxy_visit(self, ao, ai, act):
361 res = act.prepare()
362 for p in ai.ports():
363 op = getattr(ao, p.name)
364 #print (op, p, p.name)
365 res.append(act.fn(op, p))
366 return res
367
368
369 class Eq(Visitor):
370 def __init__(self):
371 self.res = []
372 def prepare(self):
373 return []
374 def fn(self, o, i):
375 rres = o.eq(i)
376 if not isinstance(rres, Sequence):
377 rres = [rres]
378 return rres
379 def __call__(self, o, i):
380 return self.visit(o, i, self)
381
382
383 def eq(o, i):
384 """ makes signals equal: a helper routine which identifies if it is being
385 passed a list (or tuple) of objects, or signals, or Records, and calls
386 the objects' eq function.
387 """
388 return Eq()(o, i)
389
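The list/tuple handling of eq can be illustrated without nmigen. The sketch below is an assumption-laden model, not this module's code: FakeSignal and eq_model are hypothetical names, and FakeSignal.eq returns a (destination, source) tuple instead of an nmigen assignment. Record, dictionary and ArrayProxy handling are deliberately left out.

```python
from collections.abc import Sequence

class FakeSignal:
    # hypothetical stand-in for an nmigen Signal: eq() returns a
    # description of the assignment instead of an AST node
    def __init__(self, name):
        self.name = name
    def eq(self, other):
        return (self.name, other.name)

def eq_model(o, i):
    # accept a single object, or a list/tuple of objects, and collect
    # the result of each pairwise o.eq(i) into one flat list
    if not isinstance(o, Sequence):
        o, i = [o], [i]
    res = []
    for ao, ai in zip(o, i):
        rres = ao.eq(ai)
        if not isinstance(rres, list):
            rres = [rres]
        res += rres
    return res

a, b, c, d = (FakeSignal(n) for n in "abcd")
single = eq_model(a, b)            # one assignment
pairs = eq_model([a, b], (c, d))   # element-wise assignments
```

The caller always receives a flat list, which is why the result can be added directly to m.d.comb or m.d.sync in the real code.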
390
391 def iterate(i):
392 """ iterate a compound structure recursively and yield data
393 """
394 if not isinstance(i, Sequence):
395 i = [i]
396 for ai in i:
397 print ("iterate", ai)
398 if isinstance(ai, Record):
399 print ("record", list(ai.layout))
400 rres = []
401 for idx, (field_name, field_shape, _) in enumerate(ai.layout):
402 if isinstance(field_shape, Layout):
403 val = ai.fields
404 else:
405 val = ai
406 if hasattr(val, field_name): # check for attribute
407 val = getattr(val, field_name)
408 else:
409 val = val[field_name] # dictionary-style specification
410 print ("recidx", idx, field_name, field_shape, val)
411 yield from iterate(val)
412
413 elif isinstance(ai, ArrayProxy) and not isinstance(ai, Value):
414 for p in ai.ports():
415 yield from iterate(p)
416 else:
417 yield ai
418
419
420 def flatten(i):
421 """ flattens a compound structure recursively using Cat
422 """
423 res = list(iterate(i))
424 return Cat(*res)
425
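The recursive walk that iterate performs can be sketched in plain Python. In this hypothetical model (iterate_model is not part of this module), dicts stand in for Records and strings stand in for Signals; unlike the real function it recurses into every nested list or tuple, and list() stands in for the final Cat.

```python
from collections.abc import Sequence

def iterate_model(i):
    # recursively walk lists/tuples and dict "records", yielding leaves
    if isinstance(i, dict):
        for value in i.values():
            yield from iterate_model(value)
    elif isinstance(i, Sequence) and not isinstance(i, str):
        for item in i:
            yield from iterate_model(item)
    else:
        yield i

nested = ["a", {"x": "b", "y": ["c", "d"]}, ("e",)]
leaves = list(iterate_model(nested))
```

Because the helper is a generator, each recursive call must itself be a generator call (yield from), so that leaves are produced one at a time in declaration order.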
426
427 class StageCls(metaclass=ABCMeta):
428 """ Class-based "Stage" API. requires instantiation (after derivation)
429
430 see "Stage API" above. Note: python does *not* require derivation
431 from this class. All that is required is that the pipelines *have*
432 the functions listed in this class. Derivation from this class
433 is therefore merely a "courtesy" to maintainers.
434 """
435 @abstractmethod
436 def ispec(self): pass # REQUIRED
437 @abstractmethod
438 def ospec(self): pass # REQUIRED
439 #@abstractmethod
440 #def setup(self, m, i): pass # OPTIONAL
441 @abstractmethod
442 def process(self, i): pass # REQUIRED
443
444
445 class Stage(metaclass=ABCMeta):
446 """ Static "Stage" API. does not require instantiation (after derivation)
447
448 see "Stage API" above. Note: python does *not* require derivation
449 from this class. All that is required is that the pipelines *have*
450 the functions listed in this class. Derivation from this class
451 is therefore merely a "courtesy" to maintainers.
452 """
453 @staticmethod
454 @abstractmethod
455 def ispec(): pass
456
457 @staticmethod
458 @abstractmethod
459 def ospec(): pass
460
461 #@staticmethod
462 #@abstractmethod
463 #def setup(m, i): pass
464
465 @staticmethod
466 @abstractmethod
467 def process(i): pass
468
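Since derivation from StageCls or Stage is optional, conformance to the Stage API is purely a matter of duck typing. The following sketch shows one way that could be checked; AddOneStage and conforms_to_stage_api are hypothetical names, and plain ints stand in for the Signal or Record objects that a real ispec()/ospec() would return.

```python
class AddOneStage:
    # hypothetical static-style stage: plain ints stand in for Signals
    @staticmethod
    def ispec():
        return 0
    @staticmethod
    def ospec():
        return 0
    @staticmethod
    def process(i):
        return i + 1

def conforms_to_stage_api(stage):
    # ispec, ospec and process are required; setup is optional
    required = ("ispec", "ospec", "process")
    return all(callable(getattr(stage, name, None)) for name in required)

ok_class = conforms_to_stage_api(AddOneStage)        # static use, no instance
ok_instance = conforms_to_stage_api(AddOneStage())   # instances work too
not_ok = conforms_to_stage_api(object())
```

The same check accepts both the class itself and an instance, which mirrors the static (Stage) and instantiated (StageCls) variants of the API.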
469
470 class RecordBasedStage(Stage):
471 """ convenience class which provides a Records-based layout.
472 honestly it's a lot easier just to create a direct Records-based
473 class (see ExampleAddRecordStage)
474 """
475 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
476 self.in_shape = in_shape
477 self.out_shape = out_shape
478 self.__process = processfn
479 self.__setup = setupfn
480 def ispec(self): return Record(self.in_shape)
481 def ospec(self): return Record(self.out_shape)
482 def process(self, i): return self.__process(i)
483 def setup(self, m, i): return self.__setup(m, i)
484
485
486 class StageChain(StageCls):
487 """ pass in a list of stages, and they will automatically be
488 chained together via their input and output specs into a
489 combinatorial chain.
490
491 the end result basically conforms to the exact same Stage API.
492
493 * input to this class will be the input of the first stage
494 * output of first stage goes into input of second
495 * output of second goes into input into third (etc. etc.)
496 * the output of this class will be the output of the last stage
497 """
498 def __init__(self, chain, specallocate=False):
499 self.chain = chain
500 self.specallocate = specallocate
501
502 def ispec(self):
503 return self.chain[0].ispec()
504
505 def ospec(self):
506 return self.chain[-1].ospec()
507
508 def _specallocate_setup(self, m, i):
509 for (idx, c) in enumerate(self.chain):
510 if hasattr(c, "setup"):
511 c.setup(m, i) # stage may have some module stuff
512 o = self.chain[idx].ospec() # last assignment survives
513 m.d.comb += eq(o, c.process(i)) # process input into "o"
514 if idx == len(self.chain)-1:
515 break
516 i = self.chain[idx+1].ispec() # new input on next loop
517 m.d.comb += eq(i, o) # assign to next input
518 return o # last loop is the output
519
520 def _noallocate_setup(self, m, i):
521 for (idx, c) in enumerate(self.chain):
522 if hasattr(c, "setup"):
523 c.setup(m, i) # stage may have some module stuff
524 i = o = c.process(i) # store input into "o"
525 return o # last loop is the output
526
527 def setup(self, m, i):
528 if self.specallocate:
529 self.o = self._specallocate_setup(m, i)
530 else:
531 self.o = self._noallocate_setup(m, i)
532
533 def process(self, i):
534 return self.o # conform to Stage API: return last-loop output
535
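The non-allocating chaining loop amounts to function composition: each stage's output becomes the next stage's input. A minimal sketch under simplifying assumptions follows; Scale, Offset and chain_process are hypothetical, ints stand in for nmigen expressions, and the optional setup(m, i) calls are omitted.

```python
class Scale:
    # hypothetical stage: multiply by a constant
    def __init__(self, k):
        self.k = k
    def process(self, i):
        return i * self.k

class Offset:
    # hypothetical stage: add a constant
    def __init__(self, n):
        self.n = n
    def process(self, i):
        return i + self.n

def chain_process(chain, i):
    # mirrors the loop in _noallocate_setup: each stage's output
    # becomes the next stage's input; the last output is returned
    o = i
    for stage in chain:
        i = o = stage.process(i)
    return o

result = chain_process([Scale(3), Offset(1)], 5)   # (5 * 3) + 1
```

Order matters: reversing the chain computes (5 + 1) * 3 instead.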
536
537 class ControlBase:
538 """ Common functions for Pipeline API
539 """
540 def __init__(self, stage=None, in_multi=None, stage_ctl=False):
541 """ Base class containing ready/valid/data to previous and next stages
542
543 * p: contains ready/valid to the previous stage
544 * n: contains ready/valid to the next stage
545
546 Except when calling ControlBase.connect(), user must also:
547 * add i_data member to PrevControl (p) and
548 * add o_data member to NextControl (n)
549 """
550 self.stage = stage
551
552 # set up input and output IO ACK (prev/next ready/valid)
553 self.p = PrevControl(in_multi, stage_ctl)
554 self.n = NextControl(stage_ctl)
555
556 # set up the input and output data
557 if stage is not None:
558 self.p.i_data = stage.ispec() # input type
559 self.n.o_data = stage.ospec()
560
561 def connect_to_next(self, nxt):
562 """ helper function to connect to the next stage data/valid/ready.
563 """
564 return self.n.connect_to_next(nxt.p)
565
566 def _connect_in(self, prev):
567 """ internal helper function to connect stage to an input source.
568 do not use to connect stage-to-stage!
569 """
570 return self.p._connect_in(prev.p)
571
572 def _connect_out(self, nxt):
573 """ internal helper function to connect stage to an output source.
574 do not use to connect stage-to-stage!
575 """
576 return self.n._connect_out(nxt.n)
577
578 def connect(self, pipechain):
579 """ connects a chain (list) of Pipeline instances together and
580 links them to this ControlBase instance:
581
582 in <----> self <---> out
583 | ^
584 v |
585 [pipe1, pipe2, pipe3, pipe4]
586 | ^ | ^ | ^
587 v | v | v |
588 out---in out--in out---in
589
590 Also takes care of allocating i_data/o_data, by looking up
591 the data spec for each end of the pipechain. i.e. it is NOT
592 necessary to allocate self.p.i_data or self.n.o_data manually:
593 this is handled AUTOMATICALLY, here.
594
595 Basically this function is the direct equivalent of StageChain,
596 except that unlike StageChain, the Pipeline logic is followed.
597
598 Just as StageChain presents an object that conforms to the
599 Stage API from a list of objects that also conform to the
600 Stage API, an object that calls this Pipeline connect function
601 has the exact same pipeline API as the list of pipeline objects
602 it is called with.
603
604 Thus it becomes possible to build up larger chains recursively.
605 More complex chains (multi-input, multi-output) will have to be
606 done manually.
607 """
608 eqs = [] # collated list of assignment statements
609
610 # connect inter-chain
611 for i in range(len(pipechain)-1):
612 pipe1 = pipechain[i]
613 pipe2 = pipechain[i+1]
614 eqs += pipe1.connect_to_next(pipe2)
615
616 # connect front of chain to ourselves
617 front = pipechain[0]
618 self.p.i_data = front.stage.ispec()
619 eqs += front._connect_in(self)
620
621 # connect end of chain to ourselves
622 end = pipechain[-1]
623 self.n.o_data = end.stage.ospec()
624 eqs += end._connect_out(self)
625
626 return eqs
627
628 def _postprocess(self, i): # XXX DISABLED
629 return i # RETURNS INPUT
630 if hasattr(self.stage, "postprocess"):
631 return self.stage.postprocess(i)
632 return i
633
634 def set_input(self, i):
635 """ helper function to set the input data
636 """
637 return eq(self.p.i_data, i)
638
639 def ports(self):
640 res = [self.p.i_valid, self.n.i_ready,
641 self.n.o_valid, self.p.o_ready,
642 ]
643 if hasattr(self.p.i_data, "ports"):
644 res += self.p.i_data.ports()
645 else:
646 res += self.p.i_data
647 if hasattr(self.n.o_data, "ports"):
648 res += self.n.o_data.ports()
649 else:
650 res += self.n.o_data
651 return res
652
653 def _elaborate(self, platform):
654 """ handles case where stage has dynamic ready/valid functions
655 """
656 m = Module()
657
658 if self.stage is not None and hasattr(self.stage, "setup"):
659 self.stage.setup(m, self.p.i_data)
660
661 if not self.p.stage_ctl:
662 return m
663
664 # intercept the previous (outgoing) "ready", combine with stage ready
665 m.d.comb += self.p.s_o_ready.eq(self.p._o_ready & self.stage.d_ready)
666
667 # intercept the next (incoming) "ready" and combine it with data valid
668 sdv = self.stage.d_valid(self.n.i_ready)
669 m.d.comb += self.n.d_valid.eq(self.n.i_ready & sdv)
670
671 return m
672
673
674 class BufferedHandshake(ControlBase):
675 """ buffered pipeline stage. data and strobe signals travel in sync.
676 if ever the input is ready and the output is not, processed data
677 is shunted in a temporary register.
678
679 Argument: stage. see Stage API above
680
681 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
682 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
683 stage-1 p.i_data >>in stage n.o_data out>> stage+1
684 | |
685 process --->----^
686 | |
687 +-- r_data ->-+
688
689 input data p.i_data is read (only), is processed and goes into an
690 intermediate result store [process()]. this is updated combinatorially.
691
692 in a non-stall condition, the intermediate result will go into the
693 output (update_output). however if ever there is a stall, it goes
694 into r_data instead [update_buffer()].
695
696 when the non-stall condition is released, r_data is the first
697 to be transferred to the output [flush_buffer()], and the stall
698 condition cleared.
699
700 on the next cycle (as long as stall is not raised again) the
701 input may begin to be processed and transferred directly to output.
702 """
703
704 def elaborate(self, platform):
705 self.m = ControlBase._elaborate(self, platform)
706
707 result = self.stage.ospec()
708 r_data = self.stage.ospec()
709
710 # establish some combinatorial temporaries
711 o_n_validn = Signal(reset_less=True)
712 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
713 nir_por = Signal(reset_less=True)
714 nir_por_n = Signal(reset_less=True)
715 p_i_valid = Signal(reset_less=True)
716 nir_novn = Signal(reset_less=True)
717 nirn_novn = Signal(reset_less=True)
718 por_pivn = Signal(reset_less=True)
719 npnn = Signal(reset_less=True)
720 self.m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
721 o_n_validn.eq(~self.n.o_valid),
722 n_i_ready.eq(self.n.i_ready_test),
723 nir_por.eq(n_i_ready & self.p._o_ready),
724 nir_por_n.eq(n_i_ready & ~self.p._o_ready),
725 nir_novn.eq(n_i_ready | o_n_validn),
726 nirn_novn.eq(~n_i_ready & o_n_validn),
727 npnn.eq(nir_por | nirn_novn),
728 por_pivn.eq(self.p._o_ready & ~p_i_valid)
729 ]
730
731 # store result of processing in combinatorial temporary
732 self.m.d.comb += eq(result, self.stage.process(self.p.i_data))
733
734 # if not in stall condition, update the temporary register
735 with self.m.If(self.p.o_ready): # not stalled
736 self.m.d.sync += eq(r_data, result) # update buffer
737
738 # data pass-through conditions
739 with self.m.If(npnn):
740 o_data = self._postprocess(result)
741 self.m.d.sync += [self.n.o_valid.eq(p_i_valid), # valid if p_valid
742 eq(self.n.o_data, o_data), # update output
743 ]
744 # buffer flush conditions (NOTE: can override data passthru conditions)
745 with self.m.If(nir_por_n): # not stalled
746 # Flush the [already processed] buffer to the output port.
747 o_data = self._postprocess(r_data)
748 self.m.d.sync += [self.n.o_valid.eq(1), # reg empty
749 eq(self.n.o_data, o_data), # flush buffer
750 ]
751 # output ready conditions
752 self.m.d.sync += self.p._o_ready.eq(nir_novn | por_pivn)
753
754 return self.m
755
756
757 class SimpleHandshake(ControlBase):
758 """ simple handshake control. data and strobe signals travel in sync.
759 implements the protocol used by Wishbone and AXI4.
760
761 Argument: stage. see Stage API above
762
763 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
764 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
765 stage-1 p.i_data >>in stage n.o_data out>> stage+1
766 | |
767 +--process->--^
768 Truth Table
769
770 Inputs Temporary Output Data
771 ------- ---------- ----- ----
772 P P N N PiV& ~NiR& N P
773 i o i o PoR NoV o o
774 V R R V V R
775
776 ------- - - - -
777 0 0 0 0 0 0 >0 0 reg
778 0 0 0 1 0 1 >1 0 reg
779 0 0 1 0 0 0 0 1 process(i_data)
780 0 0 1 1 0 0 0 1 process(i_data)
781 ------- - - - -
782 0 1 0 0 0 0 >0 0 reg
783 0 1 0 1 0 1 >1 0 reg
784 0 1 1 0 0 0 0 1 process(i_data)
785 0 1 1 1 0 0 0 1 process(i_data)
786 ------- - - - -
787 1 0 0 0 0 0 >0 0 reg
788 1 0 0 1 0 1 >1 0 reg
789 1 0 1 0 0 0 0 1 process(i_data)
790 1 0 1 1 0 0 0 1 process(i_data)
791 ------- - - - -
792 1 1 0 0 1 0 1 0 process(i_data)
793 1 1 0 1 1 1 1 0 process(i_data)
794 1 1 1 0 1 0 1 1 process(i_data)
795 1 1 1 1 1 0 1 1 process(i_data)
796 ------- - - - -
797 """
798
799 def elaborate(self, platform):
800 self.m = m = ControlBase._elaborate(self, platform)
801
802 r_busy = Signal()
803 result = self.stage.ospec()
804
805 # establish some combinatorial temporaries
806 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
807 p_i_valid_p_o_ready = Signal(reset_less=True)
808 p_i_valid = Signal(reset_less=True)
809 m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
810 n_i_ready.eq(self.n.i_ready_test),
811 p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
812 ]
813
814 # store result of processing in combinatorial temporary
815 m.d.comb += eq(result, self.stage.process(self.p.i_data))
816
817 # previous valid and ready
818 with m.If(p_i_valid_p_o_ready):
819 o_data = self._postprocess(result)
820 m.d.sync += [r_busy.eq(1), # output valid
821 eq(self.n.o_data, o_data), # update output
822 ]
823 # previous invalid or not ready, however next is accepting
824 with m.Elif(n_i_ready):
825 o_data = self._postprocess(result)
826 m.d.sync += [eq(self.n.o_data, o_data)]
827 # TODO: could still send data here (if there was any)
828 #m.d.sync += self.n.o_valid.eq(0) # ...so set output invalid
829 m.d.sync += r_busy.eq(0) # ...so set output invalid
830
831 m.d.comb += self.n.o_valid.eq(r_busy)
832 # if next is ready, so is previous
833 m.d.comb += self.p._o_ready.eq(n_i_ready)
834
835 return self.m
836
837
838 class UnbufferedPipeline(ControlBase):
839 """ A simple pipeline stage with single-clock synchronisation
840 and two-way valid/ready synchronised signalling.
841
842 Note that a stall in one stage will result in the entire pipeline
843 chain stalling.
844
845 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
846 travel synchronously with the data: the valid/ready signalling
847 combines in a *combinatorial* fashion. Therefore, a long pipeline
848 chain will lengthen propagation delays.
849
850 Argument: stage. see Stage API, above
851
852 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
853 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
854 stage-1 p.i_data >>in stage n.o_data out>> stage+1
855 | |
856 r_data result
857 | |
858 +--process ->-+
859
860 Attributes:
861 -----------
862 p.i_data : StageInput, shaped according to ispec
863 The pipeline input
864 n.o_data : StageOutput, shaped according to ospec
865 The pipeline output
866 r_data : output_shape according to ospec
867 A temporary (buffered) copy of the processed prior (valid) input.
868 This is HELD if the output is not ready. It is updated
869 SYNCHRONOUSLY.
870 result: output_shape according to ospec
871 The output of the combinatorial logic. it is updated
872 COMBINATORIALLY (no clock dependence).
873
874 Truth Table
875
876 Inputs Temp Output Data
877 ------- - ----- ----
878 P P N N ~NiR& N P
879 i o i o NoV o o
880 V R R V V R
881
882 ------- - - -
883 0 0 0 0 0 0 1 reg
884 0 0 0 1 1 1 0 reg
885 0 0 1 0 0 0 1 reg
886 0 0 1 1 0 0 1 reg
887 ------- - - -
888 0 1 0 0 0 0 1 reg
889 0 1 0 1 1 1 0 reg
890 0 1 1 0 0 0 1 reg
891 0 1 1 1 0 0 1 reg
892 ------- - - -
893 1 0 0 0 0 1 1 reg
894 1 0 0 1 1 1 0 reg
895 1 0 1 0 0 1 1 reg
896 1 0 1 1 0 1 1 reg
897 ------- - - -
898 1 1 0 0 0 1 1 process(i_data)
899 1 1 0 1 1 1 0 process(i_data)
900 1 1 1 0 0 1 1 process(i_data)
901 1 1 1 1 0 1 1 process(i_data)
902 ------- - - -
903
904 Note: PoR is *NOT* involved in the above decision-making.
905 """
906
907 def elaborate(self, platform):
908 self.m = m = ControlBase._elaborate(self, platform)
909
910 data_valid = Signal() # is data valid or not
911 r_data = self.stage.ospec() # output type
912
913 # some temporaries
914 p_i_valid = Signal(reset_less=True)
915 pv = Signal(reset_less=True)
916 buf_full = Signal(reset_less=True)
917 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
918 m.d.comb += pv.eq(self.p.i_valid & self.p.o_ready)
919 m.d.comb += buf_full.eq(~self.n.i_ready_test & data_valid)
920
921 m.d.comb += self.n.o_valid.eq(data_valid)
922 m.d.comb += self.p._o_ready.eq(~data_valid | self.n.i_ready_test)
923 m.d.sync += data_valid.eq(p_i_valid | buf_full)
924
925 with m.If(pv):
926 m.d.sync += eq(r_data, self.stage.process(self.p.i_data))
927 o_data = self._postprocess(r_data)
928 m.d.comb += eq(self.n.o_data, o_data)
929
930 return self.m
931
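The ready/valid behaviour of UnbufferedPipeline.elaborate can be followed cycle by cycle with a plain-Python model. This is a sketch, not a replacement for simulation: UnbufferedModel and run are hypothetical names, ints stand in for Signals, i_valid is assumed to be a single bit, and the upstream source is assumed to hold its data until it is accepted.

```python
class UnbufferedModel:
    # plain-Python, cycle-by-cycle model (hypothetical, for illustration)
    def __init__(self, process):
        self.process = process
        self.data_valid = 0   # registered: is r_data holding valid data?
        self.r_data = None    # registered copy of the processed input

    def o_ready(self, n_i_ready):
        # combinatorial: ready when empty, or when the next stage accepts
        return int((not self.data_valid) or n_i_ready)

    def clock(self, p_i_valid, i_data, n_i_ready):
        # sample the combinatorial signals before the clock edge
        o_valid, o_data = self.data_valid, self.r_data
        p_o_ready = self.o_ready(n_i_ready)
        buf_full = (not n_i_ready) and self.data_valid
        # synchronous updates
        if p_i_valid and p_o_ready:
            self.r_data = self.process(i_data)
        self.data_valid = int(bool(p_i_valid or buf_full))
        return o_valid, o_data, p_o_ready

def run(inputs, ready_pattern):
    # drive the model: one loop iteration per clock cycle
    pipe = UnbufferedModel(lambda x: x + 1)
    pending, received = list(inputs), []
    for n_i_ready in ready_pattern:
        p_i_valid = int(bool(pending))
        i_data = pending[0] if pending else None
        o_valid, o_data, p_o_ready = pipe.clock(p_i_valid, i_data, n_i_ready)
        if o_valid and n_i_ready:
            received.append(o_data)   # downstream accepted an item
        if p_i_valid and p_o_ready:
            pending.pop(0)            # upstream item was accepted
    return received

# the next stage stalls for two cycles in the middle of the stream
received = run([1, 2, 3], [1, 1, 0, 0, 1, 1, 1, 1])
```

During the two stalled cycles o_ready drops combinatorially, so the third input waits upstream and nothing is lost or duplicated.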
932 class UnbufferedPipeline2(ControlBase):
933 """ A simple pipeline stage with single-clock synchronisation
934 and two-way valid/ready synchronised signalling.
935
936 Note that a stall in one stage will result in the entire pipeline
937 chain stalling.
938
939 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
940 travel synchronously with the data: the valid/ready signalling
941 combines in a *combinatorial* fashion. Therefore, a long pipeline
942 chain will lengthen propagation delays.
943
944 Argument: stage. see Stage API, above
945
946 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
947 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
948 stage-1 p.i_data >>in stage n.o_data out>> stage+1
949 | | |
950 +- process-> buf <-+
951 Attributes:
952 -----------
953 p.i_data : StageInput, shaped according to ispec
954 The pipeline input
955 p.o_data : StageOutput, shaped according to ospec
956 The pipeline output
957 buf : output_shape according to ospec
958 A temporary (buffered) copy of a valid output
959 This is HELD if the output is not ready. It is updated
960 SYNCHRONOUSLY.

        Inputs   Temp  Output  Data
        -------   -    -----
        P P N N ~NiR&  N P     (buf_full)
        i o i o  NoV   o o
        V R R V        V R

        -------   -    - -
        0 0 0 0   0    0 1     process(i_data)
        0 0 0 1   1    1 0     reg (odata, unchanged)
        0 0 1 0   0    0 1     process(i_data)
        0 0 1 1   0    0 1     process(i_data)
        -------   -    - -
        0 1 0 0   0    0 1     process(i_data)
        0 1 0 1   1    1 0     reg (odata, unchanged)
        0 1 1 0   0    0 1     process(i_data)
        0 1 1 1   0    0 1     process(i_data)
        -------   -    - -
        1 0 0 0   0    1 1     process(i_data)
        1 0 0 1   1    1 0     reg (odata, unchanged)
        1 0 1 0   0    1 1     process(i_data)
        1 0 1 1   0    1 1     process(i_data)
        -------   -    - -
        1 1 0 0   0    1 1     process(i_data)
        1 1 0 1   1    1 0     reg (odata, unchanged)
        1 1 1 0   0    1 1     process(i_data)
        1 1 1 1   0    1 1     process(i_data)
        -------   -    - -

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase._elaborate(self, platform)

        buf_full = Signal() # set while buf holds output not yet accepted
        buf = self.stage.ospec() # output type

        # some temporaries
        p_i_valid = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)

        m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
        m.d.comb += self.p._o_ready.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)

        # output the held copy while stalled, otherwise the fresh result
        o_data = Mux(buf_full, buf, self.stage.process(self.p.i_data))
        o_data = self._postprocess(o_data)
        m.d.comb += eq(self.n.o_data, o_data)
        m.d.sync += eq(buf, self.n.o_data)

        return self.m
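
# Illustrative sketch, not part of the original module: a cycle-level
# plain-Python model of the buf / buf_full logic above, to show how a
# stalled output is held.  UnbufferedPipeline2Model is a hypothetical
# name; the model assumes i_valid_test / i_ready_test equal the raw
# signals and leaves out _postprocess.

```python
class UnbufferedPipeline2Model:
    """Plain-Python stand-in for the buf / buf_full handshake logic."""

    def __init__(self, process=lambda x: x):
        self.process = process   # stands in for stage.process
        self.buf_full = False    # registered: is buf holding held output?
        self.buf = None          # registered copy of the last output

    def clock(self, p_i_valid, p_i_data, n_i_ready):
        """Evaluate one clock cycle.

        Returns the combinatorial outputs (n_o_valid, p_o_ready, n_o_data)
        seen during the cycle, then updates the registered state.
        """
        n_o_valid = self.buf_full or p_i_valid
        p_o_ready = not self.buf_full
        if self.buf_full:
            n_o_data = self.buf                     # replay the held output
        else:
            n_o_data = self.process(p_i_data)       # fresh result
        # synchronous updates, taking effect on the next cycle
        self.buf_full = (not n_i_ready) and n_o_valid
        self.buf = n_o_data
        return n_o_valid, p_o_ready, n_o_data
```

# Calling clock() with n_i_ready=False while data is valid makes the
# next call report p_o_ready=False and keep returning the held value
# until a cycle with n_i_ready=True has passed.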


class PassThroughStage(StageCls):
    """ a pass-through stage which has its input data spec equal to its output,
        and "passes through" its data from input to output.
    """
    def __init__(self, iospecfn):
        self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()
    def process(self, i): return i


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary           Output  Data
        -------  ------------------  -----   ----
        P P N N  PiV& PiV| NiR| pvr  N P     (pvr)
        i o i o  PoR  ~PoR ~NoV      o o
        V R R V                      V R

        -------   -    -    -    -   - -
        0 0 0 0   0    1    1    0   1 1     odata (unchanged)
        0 0 0 1   0    1    0    0   1 0     odata (unchanged)
        0 0 1 0   0    1    1    0   1 1     odata (unchanged)
        0 0 1 1   0    1    1    0   1 1     odata (unchanged)
        -------   -    -    -    -   - -
        0 1 0 0   0    0    1    0   0 1     odata (unchanged)
        0 1 0 1   0    0    0    0   0 0     odata (unchanged)
        0 1 1 0   0    0    1    0   0 1     odata (unchanged)
        0 1 1 1   0    0    1    0   0 1     odata (unchanged)
        -------   -    -    -    -   - -
        1 0 0 0   0    1    1    1   1 1     process(in)
        1 0 0 1   0    1    0    0   1 0     odata (unchanged)
        1 0 1 0   0    1    1    1   1 1     process(in)
        1 0 1 1   0    1    1    1   1 1     process(in)
        -------   -    -    -    -   - -
        1 1 0 0   1    1    1    1   1 1     process(in)
        1 1 0 1   1    1    0    0   1 0     odata (unchanged)
        1 1 1 0   1    1    1    1   1 1     process(in)
        1 1 1 1   1    1    1    1   1 1     process(in)
        -------   -    -    -    -   - -

    """

    def elaborate(self, platform):
        self.m = m = ControlBase._elaborate(self, platform)

        r_data = self.stage.ospec() # output type

        # temporaries
        p_i_valid = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)
        m.d.comb += pvr.eq(p_i_valid & self.p.o_ready)

        m.d.comb += self.p.o_ready.eq(~self.n.o_valid | self.n.i_ready_test)
        m.d.sync += self.n.o_valid.eq(p_i_valid | ~self.p.o_ready)

        odata = Mux(pvr, self.stage.process(self.p.i_data), r_data)
        m.d.sync += eq(r_data, odata)
        r_data = self._postprocess(r_data)
        m.d.comb += eq(self.n.o_data, r_data)

        return m
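
# Illustrative sketch, not part of the original module: the column
# headings of the PassThroughHandshake truth table, written out as
# plain-Python boolean expressions so that each row can be regenerated.
# passthrough_row and passthrough_table are hypothetical names.  As in
# the table, pvr is taken from PiV and the freshly computed ready value
# (NiR | ~NoV).

```python
from itertools import product


def passthrough_row(piv, por, nir, nov):
    """Return (PiV&PoR, PiV|~PoR, NiR|~NoV, pvr, NoV_out, PoR_out)."""
    piv_and_por = piv and por
    piv_or_not_por = piv or (not por)    # becomes the next n.o_valid
    nir_or_not_nov = nir or (not nov)    # becomes p.o_ready
    pvr = piv and nir_or_not_nov         # input valid and accepted
    return (piv_and_por, piv_or_not_por, nir_or_not_nov, pvr,
            piv_or_not_por, nir_or_not_nov)


def passthrough_table():
    """All sixteen rows, inputs first, as tuples of 0/1 ints."""
    rows = []
    for inputs in product((False, True), repeat=4):
        rows.append(tuple(int(x) for x in inputs + passthrough_row(*inputs)))
    return rows
```

# Rows where pvr is 1 correspond to "process(in)" in the Data column;
# all other rows leave odata unchanged.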


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of o_data and o_valid as an indirect byproduct
        of using PassThroughStage
    """
    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses a synchronous FIFO (Queue, or SyncFIFOBuffered
        when buffered=True) to store data.  The FIFO happens to have the
        same valid/ready signalling as the Stage API.

        i_data -> fifo.din -> FIFO -> fifo.dout -> o_data
    """

    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                 fwft=True, buffered=False, pipe=False):
        """ FIFO Control

            * depth: number of entries in the FIFO
            * stage: data processing block
            * in_multi, stage_ctl: passed through to ControlBase
            * fwft : first word fall-thru mode (non-fwft introduces delay)
            * buffered: use buffered FIFO (introduces extra cycle delay)
            * pipe: passed through to Queue (unbuffered FIFO only)

            NOTE 1: FPGAs may have trouble with the defaults for SyncFIFO
                    (fwft=True, buffered=False)

            NOTE 2: i_data *must* have a shape function.  it can therefore
                    be a Signal, or a Record, or a RecordObject.

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp  fn  temp  fn   self.n
            i_data->process()->result->flatten->din.FIFO.dout->flatten(o_data)

            yes, really: flatten produces a Cat() which can be assigned to.
            this is how the FIFO gets de-flattened without needing a de-flatten
            function
        """

        assert not (fwft and buffered), "buffered cannot do fwft"
        if buffered:
            depth += 1
        self.fwft = fwft
        self.buffered = buffered
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase._elaborate(self, platform)

        # make a FIFO with a signal of equal width to the o_data.
        (fwidth, _) = self.n.o_data.shape()
        if self.buffered:
            fifo = SyncFIFOBuffered(fwidth, self.fdepth)
        else:
            fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        # store result of processing in combinatorial temporary
        result = self.stage.ospec()
        m.d.comb += eq(result, self.stage.process(self.p.i_data))

        # connect previous rdy/valid/data - do flatten on the processed result
        # NOTE: cannot do the PrevControl-looking trick because
        # of need to process the data.  shaaaame....
        m.d.comb += [fifo.we.eq(self.p.i_valid_test),
                     self.p.o_ready.eq(fifo.writable),
                     eq(fifo.din, flatten(result)),
                    ]

        # connect next rdy/valid/data - do flatten on o_data
        connections = [self.n.o_valid.eq(fifo.readable),
                       fifo.re.eq(self.n.i_ready_test),
                      ]
        if self.fwft or self.buffered:
            m.d.comb += connections
        else:
            m.d.sync += connections # unbuffered non-fwft mode needs sync
        o_data = flatten(self.n.o_data).eq(fifo.dout)
        o_data = self._postprocess(o_data)
        m.d.comb += o_data

        return m
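
# Illustrative sketch, not part of the original module: a plain-Python
# analogy for the flatten trick described in the FIFOControl docstring.
# Fields are packed into one word, first field in the least significant
# bits (the same ordering Cat() uses), and recovered by slicing the word
# with the same widths.  pack_fields and unpack_fields are hypothetical
# helpers and do not use nmigen.

```python
def pack_fields(fields):
    """Pack (value, width) pairs into one integer, first field lowest.

    Returns (word, total_width).
    """
    word = 0
    shift = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value %d does not fit in %d bits"
                             % (value, width))
        word |= value << shift
        shift += width
    return word, shift


def unpack_fields(word, widths):
    """Split a packed integer back into field values, first field lowest."""
    values = []
    for width in widths:
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values
```

# In the module above no separate unpack step is needed: assigning
# fifo.dout to flatten(self.n.o_data) performs the equivalent split.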


# NOTE: UnbufferedPipeline and PassThroughHandshake below reuse the names
# of classes defined earlier in this file, so these FIFO-based versions
# replace them for any code that looks the names up after this point.
# RegisterPipeline (above) was already bound to the earlier
# UnbufferedPipeline at class-definition time.

# aka "RegStage".
class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)

# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                             fwft=True, pipe=True)

# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)


"""
# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)
"""