1 """ Pipeline and BufferedHandshake implementation, conforming to the same API.
2 For multi-input and multi-output variants, see multipipe.
3
4 Associated development bugs:
5 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
6 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
7
8 eq:
9 --
10
11     a strategically very important function, identical in purpose to
12     nmigen's Signal.eq, except that it may take objects, or a list of
13     objects, or a tuple of objects, where those objects may also be
14     Records.
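
    As an illustrative sketch (the names below are hypothetical), copying
    one Record-like object into another is a single call:

        m.d.comb += eq(dest_record, src_record)  # copies every sub-field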
15
16 Stage API:
17 ---------
18
19     a stage requires compliance with a strict API that may be
20     implemented in several ways, including as a static class.
21 the methods of a stage instance must be as follows:
22
23 * ispec() - Input data format specification
24 returns an object or a list or tuple of objects, or
25 a Record, each object having an "eq" function which
26 takes responsibility for copying by assignment all
27 sub-objects
28 * ospec() - Output data format specification
29                 requirements as for ispec()
30     * process(i) - Processes an ispec-formatted object
31 returns a combinatorial block of a result that
32 may be assigned to the output, by way of the "eq"
33 function
34 * setup(m, i) - Optional function for setting up submodules
35 may be used for more complex stages, to link
36 the input (i) to submodules. must take responsibility
37 for adding those submodules to the module (m).
38 the submodules must be combinatorial blocks and
39 must have their inputs and output linked combinatorially.
40
41 Both StageCls (for use with non-static classes) and Stage (for use
42 by static classes) are abstract classes from which, for convenience
43 and as a courtesy to other developers, anything conforming to the
44 Stage API may *choose* to derive.
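
    As an illustrative sketch only (ExampleAddStage is hypothetical, not
    part of this module), a minimal static stage that adds two 16-bit
    numbers could look like this:

        class ExampleAddStage:
            @staticmethod
            def ispec():
                return (Signal(16, name="a"), Signal(16, name="b"))
            @staticmethod
            def ospec():
                return Signal(16, name="add_out")
            @staticmethod
            def process(i):
                return i[0] + i[1]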
45
46 StageChain:
47 ----------
48
49 A useful combinatorial wrapper around stages that chains them together
50 and then presents a Stage-API-conformant interface. By presenting
51 the same API as the stages it wraps, it can clearly be used recursively.
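
    Usage sketch (stage names are hypothetical; each stage's ospec is
    assumed to match the next stage's ispec):

        combined = StageChain([StageA(), StageB(), StageC()])
        # combined presents StageA's ispec and StageC's ospec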
52
53 RecordBasedStage:
54 ----------------
55
56 A convenience class that takes an input shape, output shape, a
57     "processing" function and an optional "setup" function.  Honestly,
58     though, it is not much more effort to simply create a class
59     that returns a couple of Records (see ExampleAddRecordStage in
60 examples).
61
62 PassThroughStage:
63 ----------------
64
65     A convenience class that takes a single function as a parameter;
66     that function is called to create both the (identical) input and output specs.
67 It has a process() function that simply returns its input.
68
69     Instances of this class are completely redundant if handed to
70     StageChain; however, when passed to UnbufferedPipeline they
71 can be used to introduce a single clock delay.
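
    For example (an illustrative sketch: "iospecfn" is any function
    returning a data spec):

        def iospecfn():
            return Signal(16, name="data")

        delay_one_clock = UnbufferedPipeline(PassThroughStage(iospecfn))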
72
73 ControlBase:
74 -----------
75
76 The base class for pipelines. Contains previous and next ready/valid/data.
77 Also has an extremely useful "connect" function that can be used to
78 connect a chain of pipelines and present the exact same prev/next
79 ready/valid/data API.
80
81 UnbufferedPipeline:
82 ------------------
83
84 A simple stalling clock-synchronised pipeline that has no buffering
85 (unlike BufferedHandshake). Data flows on *every* clock cycle when
86 the conditions are right (this is nominally when the input is valid
87 and the output is ready).
88
89 A stall anywhere along the line will result in a stall back-propagating
90 down the entire chain. The BufferedHandshake by contrast will buffer
91 incoming data, allowing previous stages one clock cycle's grace before
92 also having to stall.
93
94 An advantage of the UnbufferedPipeline over the Buffered one is
95 that the amount of logic needed (number of gates) is greatly
96     reduced (there is, basically, no second set of buffers).
97
98 The disadvantage of the UnbufferedPipeline is that the valid/ready
99 logic, if chained together, is *combinatorial*, resulting in
100 progressively larger gate delay.
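
    Illustrative instantiation (ExampleAddStage is a hypothetical
    Stage-API-conformant class, as sketched under "Stage API" above):

        pipe = UnbufferedPipeline(ExampleAddStage)
        # pipe.p.i_data / pipe.n.o_data are allocated from the stage's
        # ispec / ospec; pipe.p.i_valid, pipe.p.o_ready, pipe.n.o_valid
        # and pipe.n.i_ready form the two-way handshake.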
101
102 PassThroughHandshake:
103 ------------------
104
105 A Control class that introduces a single clock delay, passing its
106 data through unaltered. Unlike RegisterPipeline (which relies
107 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
108 itself.
109
110 RegisterPipeline:
111 ----------------
112
113     A convenience class: because UnbufferedPipeline introduces a single
114     clock delay, giving it a PassThroughStage as its stage results in a
115     pipeline stage that simply delays its (unmodified) input by one clock cycle.
116
117 BufferedHandshake:
118 ----------------
119
120 nmigen implementation of buffered pipeline stage, based on zipcpu:
121 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
122
123 this module requires quite a bit of thought to understand how it works
124 (and why it is needed in the first place). reading the above is
125 *strongly* recommended.
126
127 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
128 the STB / ACK signals to raise and lower (on separate clocks) before
129     data may proceed (thus only allowing one piece of data to proceed
130 on *ALTERNATE* cycles), the signalling here is a true pipeline
131 where data will flow on *every* clock when the conditions are right.
132
133 input acceptance conditions are when:
134 * incoming previous-stage strobe (p.i_valid) is HIGH
135 * outgoing previous-stage ready (p.o_ready) is LOW
136
137 output transmission conditions are when:
138 * outgoing next-stage strobe (n.o_valid) is HIGH
139         * incoming next-stage ready (n.i_ready) is LOW
140
141 the tricky bit is when the input has valid data and the output is not
142 ready to accept it. if it wasn't for the clock synchronisation, it
143 would be possible to tell the input "hey don't send that data, we're
144 not ready". unfortunately, it's not possible to "change the past":
145 the previous stage *has no choice* but to pass on its data.
146
147 therefore, the incoming data *must* be accepted - and stored: that
148 is the responsibility / contract that this stage *must* accept.
149 on the same clock, it's possible to tell the input that it must
150 not send any more data. this is the "stall" condition.
151
152 we now effectively have *two* possible pieces of data to "choose" from:
153 the buffered data, and the incoming data. the decision as to which
154 to process and output is based on whether we are in "stall" or not.
155 i.e. when the next stage is no longer ready, the output comes from
156 the buffer if a stall had previously occurred, otherwise it comes
157 direct from processing the input.
158
159 this allows us to respect a synchronous "travelling STB" with what
160     Dan calls a "buffered handshake".
161
162 it's quite a complex state machine!
163
164     SimpleHandshake:
165 ---------------
166
167     Synchronised pipeline, based on:
168 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
169 """
170
171 from nmigen import Signal, Cat, Const, Mux, Module, Value, Elaboratable
172 from nmigen.cli import verilog, rtlil
173 from nmigen.lib.fifo import SyncFIFO, SyncFIFOBuffered
174 from nmigen.hdl.ast import ArrayProxy
175 from nmigen.hdl.rec import Record, Layout
176
177 from abc import ABCMeta, abstractmethod
178 from collections.abc import Sequence, Iterable
179 from collections import OrderedDict
180 from queue import Queue
181 import inspect
182
183
184 class Object:
185 def __init__(self):
186 self.fields = OrderedDict()
187
188 def __setattr__(self, k, v):
189 print ("kv", k, v)
190 if (k.startswith('_') or k in ["fields", "name", "src_loc"] or
191 k in dir(Object) or "fields" not in self.__dict__):
192 return object.__setattr__(self, k, v)
193 self.fields[k] = v
194
195 def __getattr__(self, k):
196 if k in self.__dict__:
197             return object.__getattribute__(self, k)
198 try:
199 return self.fields[k]
200 except KeyError as e:
201 raise AttributeError(e)
202
203 def __iter__(self):
204 for x in self.fields.values():
205 if isinstance(x, Iterable):
206 yield from x
207 else:
208 yield x
209
210 def eq(self, inp):
211 res = []
212 for (k, o) in self.fields.items():
213 i = getattr(inp, k)
214 print ("eq", o, i)
215 rres = o.eq(i)
216 if isinstance(rres, Sequence):
217 res += rres
218 else:
219 res.append(rres)
220 print (res)
221 return res
222
223 def ports(self):
224 return list(self)
225
226
227 class RecordObject(Record):
228 def __init__(self, layout=None, name=None):
229         Record.__init__(self, layout=layout or [], name=name)
230
231 def __setattr__(self, k, v):
232 #print (dir(Record))
233 if (k.startswith('_') or k in ["fields", "name", "src_loc"] or
234 k in dir(Record) or "fields" not in self.__dict__):
235 return object.__setattr__(self, k, v)
236 self.fields[k] = v
237 #print ("RecordObject setattr", k, v)
238 if isinstance(v, Record):
239 newlayout = {k: (k, v.layout)}
240 elif isinstance(v, Value):
241 newlayout = {k: (k, v.shape())}
242 else:
243 newlayout = {k: (k, shape(v))}
244 self.layout.fields.update(newlayout)
245
246 def __iter__(self):
247 for x in self.fields.values():
248 if isinstance(x, Iterable):
249 yield from x
250 else:
251 yield x
252
253 def ports(self):
254 return list(self)
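
# Illustrative sketch (hypothetical, not part of this module's API): a
# RecordObject grows its layout as attributes are assigned, e.g.
#
#     class HypotheticalData(RecordObject):
#         def __init__(self):
#             RecordObject.__init__(self)
#             self.op_a = Signal(16)    # hypothetical field
#             self.op_b = Signal(16)    # hypothetical field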
255
256
257 def _spec(fn, name=None):
258 if name is None:
259 return fn()
260 varnames = dict(inspect.getmembers(fn.__code__))['co_varnames']
261 if 'name' in varnames:
262 return fn(name=name)
263 return fn()
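
# Example of what _spec enables (illustrative only): a spec function that
# accepts an optional "name" argument has the requested name passed through,
# e.g.
#
#     def ospec(name=None):
#         return Signal(16, name=name)  # hypothetical 16-bit output spec
#
# _spec(ospec, "o_data") then returns a Signal named "o_data", whereas a
# spec function without a "name" parameter is simply called with no args.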
264
265
266 class PrevControl(Elaboratable):
267 """ contains signals that come *from* the previous stage (both in and out)
268 * i_valid: previous stage indicating all incoming data is valid.
269 may be a multi-bit signal, where all bits are required
270 to be asserted to indicate "valid".
271 * o_ready: output to next stage indicating readiness to accept data
272 * i_data : an input - added by the user of this class
273 """
274
275 def __init__(self, i_width=1, stage_ctl=False):
276 self.stage_ctl = stage_ctl
277 self.i_valid = Signal(i_width, name="p_i_valid") # prev >>in self
278 self._o_ready = Signal(name="p_o_ready") # prev <<out self
279 self.i_data = None # XXX MUST BE ADDED BY USER
280 if stage_ctl:
281 self.s_o_ready = Signal(name="p_s_o_rdy") # prev <<out self
282 self.trigger = Signal(reset_less=True)
283
284 @property
285 def o_ready(self):
286 """ public-facing API: indicates (externally) that stage is ready
287 """
288 if self.stage_ctl:
289 return self.s_o_ready # set dynamically by stage
290 return self._o_ready # return this when not under dynamic control
291
292 def _connect_in(self, prev, direct=False, fn=None):
293 """ internal helper function to connect stage to an input source.
294 do not use to connect stage-to-stage!
295 """
296 i_valid = prev.i_valid if direct else prev.i_valid_test
297 i_data = fn(prev.i_data) if fn is not None else prev.i_data
298 return [self.i_valid.eq(i_valid),
299 prev.o_ready.eq(self.o_ready),
300 eq(self.i_data, i_data),
301 ]
302
303 @property
304 def i_valid_test(self):
305 vlen = len(self.i_valid)
306 if vlen > 1:
307 # multi-bit case: valid only when i_valid is all 1s
308 all1s = Const(-1, (len(self.i_valid), False))
309 i_valid = (self.i_valid == all1s)
310 else:
311 # single-bit i_valid case
312 i_valid = self.i_valid
313
314 # when stage indicates not ready, incoming data
315 # must "appear" to be not ready too
316 if self.stage_ctl:
317 i_valid = i_valid & self.s_o_ready
318
319 return i_valid
320
321 def elaborate(self, platform):
322 m = Module()
323 m.d.comb += self.trigger.eq(self.i_valid_test & self.o_ready)
324 return m
325
326 def eq(self, i):
327 return [self.i_data.eq(i.i_data),
328 self.o_ready.eq(i.o_ready),
329 self.i_valid.eq(i.i_valid)]
330
331 def __iter__(self):
332 yield self.i_valid
333 yield self.o_ready
334 if hasattr(self.i_data, "ports"):
335 yield from self.i_data.ports()
336 elif isinstance(self.i_data, Sequence):
337 yield from self.i_data
338 else:
339 yield self.i_data
340
341 def ports(self):
342 return list(self)
343
344
345 class NextControl(Elaboratable):
346 """ contains the signals that go *to* the next stage (both in and out)
347 * o_valid: output indicating to next stage that data is valid
348 * i_ready: input from next stage indicating that it can accept data
349 * o_data : an output - added by the user of this class
350 """
351 def __init__(self, stage_ctl=False):
352 self.stage_ctl = stage_ctl
353 self.o_valid = Signal(name="n_o_valid") # self out>> next
354 self.i_ready = Signal(name="n_i_ready") # self <<in next
355 self.o_data = None # XXX MUST BE ADDED BY USER
356 #if self.stage_ctl:
357 self.d_valid = Signal(reset=1) # INTERNAL (data valid)
358 self.trigger = Signal(reset_less=True)
359
360 @property
361 def i_ready_test(self):
362 if self.stage_ctl:
363 return self.i_ready & self.d_valid
364 return self.i_ready
365
366 def connect_to_next(self, nxt):
367 """ helper function to connect to the next stage data/valid/ready.
368 data/valid is passed *TO* nxt, and ready comes *IN* from nxt.
369 use this when connecting stage-to-stage
370 """
371 return [nxt.i_valid.eq(self.o_valid),
372 self.i_ready.eq(nxt.o_ready),
373 eq(nxt.i_data, self.o_data),
374 ]
375
376 def _connect_out(self, nxt, direct=False, fn=None):
377 """ internal helper function to connect stage to an output source.
378 do not use to connect stage-to-stage!
379 """
380 i_ready = nxt.i_ready if direct else nxt.i_ready_test
381 o_data = fn(nxt.o_data) if fn is not None else nxt.o_data
382 return [nxt.o_valid.eq(self.o_valid),
383 self.i_ready.eq(i_ready),
384 eq(o_data, self.o_data),
385 ]
386
387 def elaborate(self, platform):
388 m = Module()
389 m.d.comb += self.trigger.eq(self.i_ready_test & self.o_valid)
390 return m
391
392 def __iter__(self):
393 yield self.i_ready
394 yield self.o_valid
395 if hasattr(self.o_data, "ports"):
396 yield from self.o_data.ports()
397 elif isinstance(self.o_data, Sequence):
398 yield from self.o_data
399 else:
400 yield self.o_data
401
402 def ports(self):
403 return list(self)
404
405
406 class Visitor2:
407 """ a helper class for iterating twin-argument compound data structures.
408
409 Record is a special (unusual, recursive) case, where the input may be
410 specified as a dictionary (which may contain further dictionaries,
411 recursively), where the field names of the dictionary must match
412 the Record's field spec. Alternatively, an object with the same
413 member names as the Record may be assigned: it does not have to
414 *be* a Record.
415
416 ArrayProxy is also special-cased, it's a bit messy: whilst ArrayProxy
417 has an eq function, the object being assigned to it (e.g. a python
418 object) might not. despite the *input* having an eq function,
419 that doesn't help us, because it's the *ArrayProxy* that's being
420 assigned to. so.... we cheat. use the ports() function of the
421 python object, enumerate them, find out the list of Signals that way,
422 and assign them.
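
    For example (illustrative), a Record with fields "a" and "b" may be
    assigned from a dictionary with matching keys:

        m.d.comb += eq(some_record, {'a': sig_a, 'b': sig_b})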
423 """
424 def iterator2(self, o, i):
425 if isinstance(o, dict):
426 yield from self.dict_iter2(o, i)
427
428 if not isinstance(o, Sequence):
429 o, i = [o], [i]
430 for (ao, ai) in zip(o, i):
431 #print ("visit", fn, ao, ai)
432 if isinstance(ao, Record):
433 yield from self.record_iter2(ao, ai)
434 elif isinstance(ao, ArrayProxy) and not isinstance(ai, Value):
435 yield from self.arrayproxy_iter2(ao, ai)
436 else:
437 yield (ao, ai)
438
439 def dict_iter2(self, o, i):
440 for (k, v) in o.items():
441 print ("d-iter", v, i[k])
442 yield (v, i[k])
444
445 def _not_quite_working_with_all_unit_tests_record_iter2(self, ao, ai):
446 print ("record_iter2", ao, ai, type(ao), type(ai))
447 if isinstance(ai, Value):
448 if isinstance(ao, Sequence):
449 ao, ai = [ao], [ai]
450 for o, i in zip(ao, ai):
451 yield (o, i)
452 return
453 for idx, (field_name, field_shape, _) in enumerate(ao.layout):
454 if isinstance(field_shape, Layout):
455 val = ai.fields
456 else:
457 val = ai
458 if hasattr(val, field_name): # check for attribute
459 val = getattr(val, field_name)
460 else:
461 val = val[field_name] # dictionary-style specification
462 yield from self.iterator2(ao.fields[field_name], val)
463
464 def record_iter2(self, ao, ai):
465 for idx, (field_name, field_shape, _) in enumerate(ao.layout):
466 if isinstance(field_shape, Layout):
467 val = ai.fields
468 else:
469 val = ai
470 if hasattr(val, field_name): # check for attribute
471 val = getattr(val, field_name)
472 else:
473 val = val[field_name] # dictionary-style specification
474 yield from self.iterator2(ao.fields[field_name], val)
475
476 def arrayproxy_iter2(self, ao, ai):
477 for p in ai.ports():
478 op = getattr(ao, p.name)
479 print ("arrayproxy - p", p, p.name)
480 yield from self.iterator2(op, p)
481
482
483 class Visitor:
484 """ a helper class for iterating single-argument compound data structures.
485 similar to Visitor2.
486 """
487 def iterate(self, i):
488 """ iterate a compound structure recursively using yield
489 """
490 if not isinstance(i, Sequence):
491 i = [i]
492 for ai in i:
493 #print ("iterate", ai)
494 if isinstance(ai, Record):
495 #print ("record", list(ai.layout))
496 yield from self.record_iter(ai)
497 elif isinstance(ai, ArrayProxy) and not isinstance(ai, Value):
498 yield from self.array_iter(ai)
499 else:
500 yield ai
501
502 def record_iter(self, ai):
503 for idx, (field_name, field_shape, _) in enumerate(ai.layout):
504 if isinstance(field_shape, Layout):
505 val = ai.fields
506 else:
507 val = ai
508 if hasattr(val, field_name): # check for attribute
509 val = getattr(val, field_name)
510 else:
511 val = val[field_name] # dictionary-style specification
512 #print ("recidx", idx, field_name, field_shape, val)
513 yield from self.iterate(val)
514
515 def array_iter(self, ai):
516 for p in ai.ports():
517 yield from self.iterate(p)
518
519
520 def eq(o, i):
521 """ makes signals equal: a helper routine which identifies if it is being
522 passed a list (or tuple) of objects, or signals, or Records, and calls
523 the objects' eq function.
524 """
525 res = []
526 for (ao, ai) in Visitor2().iterator2(o, i):
527 rres = ao.eq(ai)
528 if not isinstance(rres, Sequence):
529 rres = [rres]
530 res += rres
531 return res
532
533
534 def shape(i):
535 #print ("shape", i)
536 r = 0
537 for part in list(i):
538 #print ("shape?", part)
539 s, _ = part.shape()
540 r += s
541 return r, False
542
543
544 def cat(i):
545 """ flattens a compound structure recursively using Cat
546 """
547 from nmigen.tools import flatten
548 #res = list(flatten(i)) # works (as of nmigen commit f22106e5) HOWEVER...
549 res = list(Visitor().iterate(i)) # needed because input may be a sequence
550 return Cat(*res)
551
552
553 class StageCls(metaclass=ABCMeta):
554 """ Class-based "Stage" API. requires instantiation (after derivation)
555
556 see "Stage API" above.. Note: python does *not* require derivation
557 from this class. All that is required is that the pipelines *have*
558 the functions listed in this class. Derivation from this class
559 is therefore merely a "courtesy" to maintainers.
560 """
561 @abstractmethod
562 def ispec(self): pass # REQUIRED
563 @abstractmethod
564 def ospec(self): pass # REQUIRED
565 #@abstractmethod
566 #def setup(self, m, i): pass # OPTIONAL
567 @abstractmethod
568 def process(self, i): pass # REQUIRED
569
570
571 class Stage(metaclass=ABCMeta):
572 """ Static "Stage" API. does not require instantiation (after derivation)
573
574 see "Stage API" above. Note: python does *not* require derivation
575 from this class. All that is required is that the pipelines *have*
576 the functions listed in this class. Derivation from this class
577 is therefore merely a "courtesy" to maintainers.
578 """
579 @staticmethod
580 @abstractmethod
581 def ispec(): pass
582
583 @staticmethod
584 @abstractmethod
585 def ospec(): pass
586
587 #@staticmethod
588 #@abstractmethod
589 #def setup(m, i): pass
590
591 @staticmethod
592 @abstractmethod
593 def process(i): pass
594
595
596 class RecordBasedStage(Stage):
597 """ convenience class which provides a Records-based layout.
598 honestly it's a lot easier just to create a direct Records-based
599 class (see ExampleAddRecordStage)
600 """
601 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
602 self.in_shape = in_shape
603 self.out_shape = out_shape
604 self.__process = processfn
605 self.__setup = setupfn
606 def ispec(self): return Record(self.in_shape)
607 def ospec(self): return Record(self.out_shape)
608     def process(self, i): return self.__process(i)
609     def setup(self, m, i): return self.__setup(m, i)
610
611
612 class StageChain(StageCls):
613 """ pass in a list of stages, and they will automatically be
614 chained together via their input and output specs into a
615 combinatorial chain.
616
617 the end result basically conforms to the exact same Stage API.
618
619 * input to this class will be the input of the first stage
620 * output of first stage goes into input of second
621 * output of second goes into input into third (etc. etc.)
622 * the output of this class will be the output of the last stage
623 """
624 def __init__(self, chain, specallocate=False):
625 self.chain = chain
626 self.specallocate = specallocate
627
628 def ispec(self):
629 return _spec(self.chain[0].ispec, "chainin")
630
631 def ospec(self):
632 return _spec(self.chain[-1].ospec, "chainout")
633
634 def _specallocate_setup(self, m, i):
635 for (idx, c) in enumerate(self.chain):
636 if hasattr(c, "setup"):
637 c.setup(m, i) # stage may have some module stuff
638 ofn = self.chain[idx].ospec # last assignment survives
639 o = _spec(ofn, 'chainin%d' % idx)
640 m.d.comb += eq(o, c.process(i)) # process input into "o"
641 if idx == len(self.chain)-1:
642 break
643             ifn = self.chain[idx+1].ispec # new input on next loop
644             i = _spec(ifn, 'chainin%d' % (idx+1))
645 m.d.comb += eq(i, o) # assign to next input
646 return o # last loop is the output
647
648 def _noallocate_setup(self, m, i):
649 for (idx, c) in enumerate(self.chain):
650 if hasattr(c, "setup"):
651 c.setup(m, i) # stage may have some module stuff
652 i = o = c.process(i) # store input into "o"
653 return o # last loop is the output
654
655 def setup(self, m, i):
656 if self.specallocate:
657 self.o = self._specallocate_setup(m, i)
658 else:
659 self.o = self._noallocate_setup(m, i)
660
661 def process(self, i):
662 return self.o # conform to Stage API: return last-loop output
663
664
665 class ControlBase(Elaboratable):
666 """ Common functions for Pipeline API
667 """
668 def __init__(self, stage=None, in_multi=None, stage_ctl=False):
669 """ Base class containing ready/valid/data to previous and next stages
670
671 * p: contains ready/valid to the previous stage
672 * n: contains ready/valid to the next stage
673
674 Except when calling Controlbase.connect(), user must also:
675 * add i_data member to PrevControl (p) and
676 * add o_data member to NextControl (n)
677 """
678 self.stage = stage
679
680 # set up input and output IO ACK (prev/next ready/valid)
681 self.p = PrevControl(in_multi, stage_ctl)
682 self.n = NextControl(stage_ctl)
683
684 # set up the input and output data
685 if stage is not None:
686 self.p.i_data = _spec(stage.ispec, "i_data") # input type
687 self.n.o_data = _spec(stage.ospec, "o_data") # output type
688
689 def connect_to_next(self, nxt):
690 """ helper function to connect to the next stage data/valid/ready.
691 """
692 return self.n.connect_to_next(nxt.p)
693
694 def _connect_in(self, prev):
695 """ internal helper function to connect stage to an input source.
696 do not use to connect stage-to-stage!
697 """
698 return self.p._connect_in(prev.p)
699
700 def _connect_out(self, nxt):
701 """ internal helper function to connect stage to an output source.
702 do not use to connect stage-to-stage!
703 """
704 return self.n._connect_out(nxt.n)
705
706 def connect(self, pipechain):
707 """ connects a chain (list) of Pipeline instances together and
708 links them to this ControlBase instance:
709
710 in <----> self <---> out
711 | ^
712 v |
713 [pipe1, pipe2, pipe3, pipe4]
714 | ^ | ^ | ^
715 v | v | v |
716 out---in out--in out---in
717
718 Also takes care of allocating i_data/o_data, by looking up
719 the data spec for each end of the pipechain. i.e It is NOT
720 necessary to allocate self.p.i_data or self.n.o_data manually:
721 this is handled AUTOMATICALLY, here.
722
723 Basically this function is the direct equivalent of StageChain,
724 except that unlike StageChain, the Pipeline logic is followed.
725
726 Just as StageChain presents an object that conforms to the
727 Stage API from a list of objects that also conform to the
728 Stage API, an object that calls this Pipeline connect function
729 has the exact same pipeline API as the list of pipline objects
730 it is called with.
731
732 Thus it becomes possible to build up larger chains recursively.
733 More complex chains (multi-input, multi-output) will have to be
734 done manually.
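
        Illustrative usage sketch (ExampleChain and ExampleStage are
        hypothetical names):

            class ExampleChain(ControlBase):
                def elaborate(self, platform):
                    m = ControlBase.elaborate(self, platform)
                    pipe1 = SimpleHandshake(ExampleStage)
                    pipe2 = SimpleHandshake(ExampleStage)
                    m.submodules.pipe1 = pipe1
                    m.submodules.pipe2 = pipe2
                    m.d.comb += self.connect([pipe1, pipe2])
                    return m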
735 """
736 eqs = [] # collated list of assignment statements
737
738 # connect inter-chain
739 for i in range(len(pipechain)-1):
740 pipe1 = pipechain[i]
741 pipe2 = pipechain[i+1]
742 eqs += pipe1.connect_to_next(pipe2)
743
744 # connect front of chain to ourselves
745 front = pipechain[0]
746 self.p.i_data = _spec(front.stage.ispec, "chainin")
747 eqs += front._connect_in(self)
748
749 # connect end of chain to ourselves
750 end = pipechain[-1]
751 self.n.o_data = _spec(end.stage.ospec, "chainout")
752 eqs += end._connect_out(self)
753
754 return eqs
755
756 def _postprocess(self, i): # XXX DISABLED
757 return i # RETURNS INPUT
758 if hasattr(self.stage, "postprocess"):
759 return self.stage.postprocess(i)
760 return i
761
762 def set_input(self, i):
763 """ helper function to set the input data
764 """
765 return eq(self.p.i_data, i)
766
767 def __iter__(self):
768 yield from self.p
769 yield from self.n
770
771 def ports(self):
772 return list(self)
773
774 def elaborate(self, platform):
775 """ handles case where stage has dynamic ready/valid functions
776 """
777 m = Module()
778 m.submodules.p = self.p
779 m.submodules.n = self.n
780
781 if self.stage is not None and hasattr(self.stage, "setup"):
782 self.stage.setup(m, self.p.i_data)
783
784 if not self.p.stage_ctl:
785 return m
786
787 # intercept the previous (outgoing) "ready", combine with stage ready
788 m.d.comb += self.p.s_o_ready.eq(self.p._o_ready & self.stage.d_ready)
789
790 # intercept the next (incoming) "ready" and combine it with data valid
791 sdv = self.stage.d_valid(self.n.i_ready)
792 m.d.comb += self.n.d_valid.eq(self.n.i_ready & sdv)
793
794 return m
795
796
797 class BufferedHandshake(ControlBase):
798 """ buffered pipeline stage. data and strobe signals travel in sync.
799 if ever the input is ready and the output is not, processed data
800 is shunted in a temporary register.
801
802 Argument: stage. see Stage API above
803
804 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
805 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
806 stage-1 p.i_data >>in stage n.o_data out>> stage+1
807 | |
808 process --->----^
809 | |
810 +-- r_data ->-+
811
812 input data p.i_data is read (only), is processed and goes into an
813 intermediate result store [process()]. this is updated combinatorially.
814
815 in a non-stall condition, the intermediate result will go into the
816 output (update_output). however if ever there is a stall, it goes
817 into r_data instead [update_buffer()].
818
819 when the non-stall condition is released, r_data is the first
820 to be transferred to the output [flush_buffer()], and the stall
821 condition cleared.
822
823 on the next cycle (as long as stall is not raised again) the
824 input may begin to be processed and transferred directly to output.
825 """
826
827 def elaborate(self, platform):
828 self.m = ControlBase.elaborate(self, platform)
829
830 result = _spec(self.stage.ospec, "r_tmp")
831 r_data = _spec(self.stage.ospec, "r_data")
832
833 # establish some combinatorial temporaries
834 o_n_validn = Signal(reset_less=True)
835 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
836 nir_por = Signal(reset_less=True)
837 nir_por_n = Signal(reset_less=True)
838 p_i_valid = Signal(reset_less=True)
839 nir_novn = Signal(reset_less=True)
840 nirn_novn = Signal(reset_less=True)
841 por_pivn = Signal(reset_less=True)
842 npnn = Signal(reset_less=True)
843 self.m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
844 o_n_validn.eq(~self.n.o_valid),
845 n_i_ready.eq(self.n.i_ready_test),
846 nir_por.eq(n_i_ready & self.p._o_ready),
847 nir_por_n.eq(n_i_ready & ~self.p._o_ready),
848 nir_novn.eq(n_i_ready | o_n_validn),
849 nirn_novn.eq(~n_i_ready & o_n_validn),
850 npnn.eq(nir_por | nirn_novn),
851 por_pivn.eq(self.p._o_ready & ~p_i_valid)
852 ]
853
854 # store result of processing in combinatorial temporary
855 self.m.d.comb += eq(result, self.stage.process(self.p.i_data))
856
857 # if not in stall condition, update the temporary register
858 with self.m.If(self.p.o_ready): # not stalled
859 self.m.d.sync += eq(r_data, result) # update buffer
860
861 # data pass-through conditions
862 with self.m.If(npnn):
863 o_data = self._postprocess(result)
864 self.m.d.sync += [self.n.o_valid.eq(p_i_valid), # valid if p_valid
865 eq(self.n.o_data, o_data), # update output
866 ]
867 # buffer flush conditions (NOTE: can override data passthru conditions)
868 with self.m.If(nir_por_n): # not stalled
869 # Flush the [already processed] buffer to the output port.
870 o_data = self._postprocess(r_data)
871 self.m.d.sync += [self.n.o_valid.eq(1), # reg empty
872 eq(self.n.o_data, o_data), # flush buffer
873 ]
874 # output ready conditions
875 self.m.d.sync += self.p._o_ready.eq(nir_novn | por_pivn)
876
877 return self.m
878
879
880 class SimpleHandshake(ControlBase):
881 """ simple handshake control. data and strobe signals travel in sync.
882 implements the protocol used by Wishbone and AXI4.
883
884 Argument: stage. see Stage API above
885
886 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
887 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
888 stage-1 p.i_data >>in stage n.o_data out>> stage+1
889 | |
890 +--process->--^
891 Truth Table
892
893 Inputs Temporary Output Data
894 ------- ---------- ----- ----
895 P P N N PiV& ~NiR& N P
896 i o i o PoR NoV o o
897 V R R V V R
898
899 ------- - - - -
900 0 0 0 0 0 0 >0 0 reg
901 0 0 0 1 0 1 >1 0 reg
902 0 0 1 0 0 0 0 1 process(i_data)
903 0 0 1 1 0 0 0 1 process(i_data)
904 ------- - - - -
905 0 1 0 0 0 0 >0 0 reg
906 0 1 0 1 0 1 >1 0 reg
907 0 1 1 0 0 0 0 1 process(i_data)
908 0 1 1 1 0 0 0 1 process(i_data)
909 ------- - - - -
910 1 0 0 0 0 0 >0 0 reg
911 1 0 0 1 0 1 >1 0 reg
912 1 0 1 0 0 0 0 1 process(i_data)
913 1 0 1 1 0 0 0 1 process(i_data)
914 ------- - - - -
915 1 1 0 0 1 0 1 0 process(i_data)
916 1 1 0 1 1 1 1 0 process(i_data)
917 1 1 1 0 1 0 1 1 process(i_data)
918 1 1 1 1 1 0 1 1 process(i_data)
919 ------- - - - -
920 """
921
922 def elaborate(self, platform):
923 self.m = m = ControlBase.elaborate(self, platform)
924
925 r_busy = Signal()
926 result = _spec(self.stage.ospec, "r_tmp")
927
928 # establish some combinatorial temporaries
929 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
930 p_i_valid_p_o_ready = Signal(reset_less=True)
931 p_i_valid = Signal(reset_less=True)
932 m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
933 n_i_ready.eq(self.n.i_ready_test),
934 p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
935 ]
936
937 # store result of processing in combinatorial temporary
938 m.d.comb += eq(result, self.stage.process(self.p.i_data))
939
940 # previous valid and ready
941 with m.If(p_i_valid_p_o_ready):
942 o_data = self._postprocess(result)
943 m.d.sync += [r_busy.eq(1), # output valid
944 eq(self.n.o_data, o_data), # update output
945 ]
946 # previous invalid or not ready, however next is accepting
947 with m.Elif(n_i_ready):
948 o_data = self._postprocess(result)
949 m.d.sync += [eq(self.n.o_data, o_data)]
950 # TODO: could still send data here (if there was any)
951 #m.d.sync += self.n.o_valid.eq(0) # ...so set output invalid
952 m.d.sync += r_busy.eq(0) # ...so set output invalid
953
954 m.d.comb += self.n.o_valid.eq(r_busy)
955 # if next is ready, so is previous
956 m.d.comb += self.p._o_ready.eq(n_i_ready)
957
958 return self.m
959
960
961 class UnbufferedPipeline(ControlBase):
962 """ A simple pipeline stage with single-clock synchronisation
963 and two-way valid/ready synchronised signalling.
964
965 Note that a stall in one stage will result in the entire pipeline
966 chain stalling.
967
968 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
969 travel synchronously with the data: the valid/ready signalling
970 combines in a *combinatorial* fashion. Therefore, a long pipeline
971 chain will lengthen propagation delays.
972
973 Argument: stage. see Stage API, above
974
975 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
976 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
977 stage-1 p.i_data >>in stage n.o_data out>> stage+1
978 | |
979 r_data result
980 | |
981 +--process ->-+
982
983 Attributes:
984 -----------
985 p.i_data : StageInput, shaped according to ispec
986 The pipeline input
987         n.o_data : StageOutput, shaped according to ospec
988             The pipeline output
989         r_data : output shape according to ospec
990             A temporary (registered) copy of the processed result.
991 This is HELD if the output is not ready. It is updated
992 SYNCHRONOUSLY.
993 result: output_shape according to ospec
994 The output of the combinatorial logic. it is updated
995 COMBINATORIALLY (no clock dependence).
996
997 Truth Table
998
999 Inputs Temp Output Data
1000 ------- - ----- ----
1001 P P N N ~NiR& N P
1002 i o i o NoV o o
1003 V R R V V R
1004
1005 ------- - - -
1006 0 0 0 0 0 0 1 reg
1007 0 0 0 1 1 1 0 reg
1008 0 0 1 0 0 0 1 reg
1009 0 0 1 1 0 0 1 reg
1010 ------- - - -
1011 0 1 0 0 0 0 1 reg
1012 0 1 0 1 1 1 0 reg
1013 0 1 1 0 0 0 1 reg
1014 0 1 1 1 0 0 1 reg
1015 ------- - - -
1016 1 0 0 0 0 1 1 reg
1017 1 0 0 1 1 1 0 reg
1018 1 0 1 0 0 1 1 reg
1019 1 0 1 1 0 1 1 reg
1020 ------- - - -
1021 1 1 0 0 0 1 1 process(i_data)
1022 1 1 0 1 1 1 0 process(i_data)
1023 1 1 1 0 0 1 1 process(i_data)
1024 1 1 1 1 0 1 1 process(i_data)
1025 ------- - - -
1026
1027 Note: PoR is *NOT* involved in the above decision-making.
1028 """
1029
1030 def elaborate(self, platform):
1031 self.m = m = ControlBase.elaborate(self, platform)
1032
1033 data_valid = Signal() # is data valid or not
1034 r_data = _spec(self.stage.ospec, "r_tmp") # output type
1035
1036 # some temporaries
1037 p_i_valid = Signal(reset_less=True)
1038 pv = Signal(reset_less=True)
1039 buf_full = Signal(reset_less=True)
1040 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
1041 m.d.comb += pv.eq(self.p.i_valid & self.p.o_ready)
1042 m.d.comb += buf_full.eq(~self.n.i_ready_test & data_valid)
1043
1044 m.d.comb += self.n.o_valid.eq(data_valid)
1045 m.d.comb += self.p._o_ready.eq(~data_valid | self.n.i_ready_test)
1046 m.d.sync += data_valid.eq(p_i_valid | buf_full)
1047
1048 with m.If(pv):
1049 m.d.sync += eq(r_data, self.stage.process(self.p.i_data))
1050 o_data = self._postprocess(r_data)
1051 m.d.comb += eq(self.n.o_data, o_data)
1052
1053 return self.m
1054
1055 class UnbufferedPipeline2(ControlBase):
1056 """ A simple pipeline stage with single-clock synchronisation
1057 and two-way valid/ready synchronised signalling.
1058
1059 Note that a stall in one stage will result in the entire pipeline
1060 chain stalling.
1061
1062 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
1063 travel synchronously with the data: the valid/ready signalling
1064 combines in a *combinatorial* fashion. Therefore, a long pipeline
1065 chain will lengthen propagation delays.
1066
1067 Argument: stage. see Stage API, above
1068
1069 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
1070 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
1071 stage-1 p.i_data >>in stage n.o_data out>> stage+1
1072 | | |
1073 +- process-> buf <-+
1074 Attributes:
1075 -----------
1076 p.i_data : StageInput, shaped according to ispec
1077 The pipeline input
1078         n.o_data : StageOutput, shaped according to ospec
1079 The pipeline output
1080 buf : output_shape according to ospec
1081 A temporary (buffered) copy of a valid output
1082 This is HELD if the output is not ready. It is updated
1083 SYNCHRONOUSLY.
1084
1085 Inputs Temp Output Data
1086 ------- - -----
1087 P P N N ~NiR& N P (buf_full)
1088 i o i o NoV o o
1089 V R R V V R
1090
1091 ------- - - -
1092 0 0 0 0 0 0 1 process(i_data)
1093 0 0 0 1 1 1 0 reg (odata, unchanged)
1094 0 0 1 0 0 0 1 process(i_data)
1095 0 0 1 1 0 0 1 process(i_data)
1096 ------- - - -
1097 0 1 0 0 0 0 1 process(i_data)
1098 0 1 0 1 1 1 0 reg (odata, unchanged)
1099 0 1 1 0 0 0 1 process(i_data)
1100 0 1 1 1 0 0 1 process(i_data)
1101 ------- - - -
1102 1 0 0 0 0 1 1 process(i_data)
1103 1 0 0 1 1 1 0 reg (odata, unchanged)
1104 1 0 1 0 0 1 1 process(i_data)
1105 1 0 1 1 0 1 1 process(i_data)
1106 ------- - - -
1107 1 1 0 0 0 1 1 process(i_data)
1108 1 1 0 1 1 1 0 reg (odata, unchanged)
1109 1 1 1 0 0 1 1 process(i_data)
1110 1 1 1 1 0 1 1 process(i_data)
1111 ------- - - -
1112
1113 Note: PoR is *NOT* involved in the above decision-making.
1114 """
1115
1116 def elaborate(self, platform):
1117 self.m = m = ControlBase.elaborate(self, platform)
1118
1119 buf_full = Signal() # is data valid or not
1120 buf = _spec(self.stage.ospec, "r_tmp") # output type
1121
1122 # some temporaries
1123 p_i_valid = Signal(reset_less=True)
1124 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
1125
1126 m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
1127 m.d.comb += self.p._o_ready.eq(~buf_full)
1128 m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)
1129
1130 o_data = Mux(buf_full, buf, self.stage.process(self.p.i_data))
1131 o_data = self._postprocess(o_data)
1132 m.d.comb += eq(self.n.o_data, o_data)
1133 m.d.sync += eq(buf, self.n.o_data)
1134
1135 return self.m
1136
1137
1138 class PassThroughStage(StageCls):
1139 """ a pass-through stage which has its input data spec equal to its output,
1140 and "passes through" its data from input to output.
1141 """
1142 def __init__(self, iospecfn):
1143 self.iospecfn = iospecfn
1144 def ispec(self): return self.iospecfn()
1145 def ospec(self): return self.iospecfn()
1146 def process(self, i): return i
1147
1148
1149 class PassThroughHandshake(ControlBase):
1150 """ A control block that delays by one clock cycle.
1151
1152 Inputs Temporary Output Data
1153 ------- ------------------ ----- ----
1154 P P N N PiV& PiV| NiR| pvr N P (pvr)
1155 i o i o PoR ~PoR ~NoV o o
1156 V R R V V R
1157
1158 ------- - - - - - -
1159 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
1160 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
1161 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
1162 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
1163 ------- - - - - - -
1164 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
1165 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
1166 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
1167 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
1168 ------- - - - - - -
1169 1 0 0 0 0 1 1 1 1 1 process(in)
1170 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
1171 1 0 1 0 0 1 1 1 1 1 process(in)
1172 1 0 1 1 0 1 1 1 1 1 process(in)
1173 ------- - - - - - -
1174 1 1 0 0 1 1 1 1 1 1 process(in)
1175 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
1176 1 1 1 0 1 1 1 1 1 1 process(in)
1177 1 1 1 1 1 1 1 1 1 1 process(in)
1178 ------- - - - - - -
1179
1180 """
1181
1182 def elaborate(self, platform):
1183 self.m = m = ControlBase.elaborate(self, platform)
1184
1185 r_data = _spec(self.stage.ospec, "r_tmp") # output type
1186
1187 # temporaries
1188 p_i_valid = Signal(reset_less=True)
1189 pvr = Signal(reset_less=True)
1190 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
1191 m.d.comb += pvr.eq(p_i_valid & self.p.o_ready)
1192
1193 m.d.comb += self.p.o_ready.eq(~self.n.o_valid | self.n.i_ready_test)
1194 m.d.sync += self.n.o_valid.eq(p_i_valid | ~self.p.o_ready)
1195
1196 odata = Mux(pvr, self.stage.process(self.p.i_data), r_data)
1197 m.d.sync += eq(r_data, odata)
1198 r_data = self._postprocess(r_data)
1199 m.d.comb += eq(self.n.o_data, r_data)
1200
1201 return m
1202
1203
1204 class RegisterPipeline(UnbufferedPipeline):
1205 """ A pipeline stage that delays by one clock cycle, creating a
1206 sync'd latch out of o_data and o_valid as an indirect byproduct
1207 of using PassThroughStage
1208 """
1209 def __init__(self, iospecfn):
1210 UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))
1211
1212
1213 class FIFOControl(ControlBase):
1214 """ FIFO Control. Uses SyncFIFO to store data, coincidentally
1215 happens to have same valid/ready signalling as Stage API.
1216
1217 i_data -> fifo.din -> FIFO -> fifo.dout -> o_data
1218 """
1219
1220 def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
1221 fwft=True, buffered=False, pipe=False):
1222 """ FIFO Control
1223
1224 * depth: number of entries in the FIFO
1225 * stage: data processing block
1226 * fwft : first word fall-thru mode (non-fwft introduces delay)
1227 * buffered: use buffered FIFO (introduces extra cycle delay)
1228
1229 NOTE 1: FPGAs may have trouble with the defaults for SyncFIFO
1230 (fwft=True, buffered=False)
1231
1232 NOTE 2: i_data *must* have a shape function. it can therefore
1233 be a Signal, or a Record, or a RecordObject.
1234
1235 data is processed (and located) as follows:
1236
1237 self.p self.stage temp fn temp fn temp fp self.n
1238 i_data->process()->result->cat->din.FIFO.dout->cat(o_data)
1239
1240 yes, really: cat produces a Cat() which can be assigned to.
1241 this is how the FIFO gets de-catted without needing a de-cat
1242 function
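
        Illustrative instantiation (ExampleStage is a hypothetical
        Stage-API class whose data specs satisfy NOTE 2):

            fifo_pipe = FIFOControl(4, ExampleStage)  # 4-deep, fwft mode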
1243 """
1244
1245 assert not (fwft and buffered), "buffered cannot do fwft"
1246 if buffered:
1247 depth += 1
1248 self.fwft = fwft
1249 self.buffered = buffered
1250 self.pipe = pipe
1251 self.fdepth = depth
1252 ControlBase.__init__(self, stage, in_multi, stage_ctl)
1253
1254 def elaborate(self, platform):
1255 self.m = m = ControlBase.elaborate(self, platform)
1256
1257 # make a FIFO with a signal of equal width to the o_data.
1258 (fwidth, _) = shape(self.n.o_data)
1259 if self.buffered:
1260 fifo = SyncFIFOBuffered(fwidth, self.fdepth)
1261 else:
1262 fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
1263 m.submodules.fifo = fifo
1264
1265 # store result of processing in combinatorial temporary
1266 result = _spec(self.stage.ospec, "r_temp")
1267 m.d.comb += eq(result, self.stage.process(self.p.i_data))
1268
1269 # connect previous rdy/valid/data - do cat on i_data
1270 # NOTE: cannot do the PrevControl-looking trick because
1271 # of need to process the data. shaaaame....
1272 m.d.comb += [fifo.we.eq(self.p.i_valid_test),
1273 self.p.o_ready.eq(fifo.writable),
1274 eq(fifo.din, cat(result)),
1275 ]
1276
1277 # connect next rdy/valid/data - do cat on o_data
1278 connections = [self.n.o_valid.eq(fifo.readable),
1279 fifo.re.eq(self.n.i_ready_test),
1280 ]
1281 if self.fwft or self.buffered:
1282 m.d.comb += connections
1283 else:
1284 m.d.sync += connections # unbuffered fwft mode needs sync
1285 o_data = cat(self.n.o_data).eq(fifo.dout)
1286 o_data = self._postprocess(o_data)
1287 m.d.comb += o_data
1288
1289 return m
1290
1291
1292 # aka "RegStage".
1293 class UnbufferedPipeline(FIFOControl):
1294 def __init__(self, stage, in_multi=None, stage_ctl=False):
1295 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
1296 fwft=True, pipe=False)
1297
1298 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
1299 class PassThroughHandshake(FIFOControl):
1300 def __init__(self, stage, in_multi=None, stage_ctl=False):
1301 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
1302 fwft=True, pipe=True)
1303
1304 # this is *probably* BufferedHandshake, although test #997 now succeeds.
1305 class BufferedHandshake(FIFOControl):
1306 def __init__(self, stage, in_multi=None, stage_ctl=False):
1307 FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
1308 fwft=True, pipe=False)
1309
1310
1311 """
1312 # this is *probably* SimpleHandshake (note: memory cell size=0)
1313 class SimpleHandshake(FIFOControl):
1314 def __init__(self, stage, in_multi=None, stage_ctl=False):
1315 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
1316 fwft=True, pipe=False)
1317 """