# ieee754fpu.git / src / add / singlepipe.py
# commit: move eq, shape and cat to nmoperator.py
""" Pipeline and BufferedHandshake implementation, conforming to the same API.
    For multi-input and multi-output variants, see multipipe.

    Associated development bugs:
    * http://bugs.libre-riscv.org/show_bug.cgi?id=64
    * http://bugs.libre-riscv.org/show_bug.cgi?id=57

    eq:
    --

    a strategically very important function that is identical in function
    to nmigen's Signal.eq function, except it may take objects, or a list
    of objects, or a tuple of objects, and where objects may also be
    Records.

    Stage API:
    ---------

    stage requires compliance with a strict API that may be
    implemented in several ways, including as a static class.
    the methods of a stage instance must be as follows:

    * ispec() - Input data format specification
                returns an object or a list or tuple of objects, or
                a Record, each object having an "eq" function which
                takes responsibility for copying by assignment all
                sub-objects
    * ospec() - Output data format specification
                requirements as for ispec
    * process(i) - Processes an ispec-formatted object
                returns a combinatorial block of a result that
                may be assigned to the output, by way of the "eq"
                function
    * setup(m, i) - Optional function for setting up submodules
                may be used for more complex stages, to link
                the input (i) to submodules.  must take responsibility
                for adding those submodules to the module (m).
                the submodules must be combinatorial blocks and
                must have their inputs and outputs linked combinatorially.

    Both StageCls (for use with non-static classes) and Stage (for use
    by static classes) are abstract classes from which, for convenience
    and as a courtesy to other developers, anything conforming to the
    Stage API may *choose* to derive.

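A minimal sketch of what conforming to the Stage API might look like. To keep
it runnable without nmigen it uses plain Python integers, so FakeSignal and
ExampleAddOneStage are hypothetical names that are not part of this module,
and FakeSignal.eq copies immediately, whereas nmigen's eq returns an
assignment statement to be added to a module.

```python
class FakeSignal:
    # Hypothetical stand-in for an nmigen Signal: holds a plain integer.
    def __init__(self, value=0):
        self.value = value

    def eq(self, other):
        # Copy by assignment, as the Stage API requires of spec objects.
        self.value = other.value
        return self


class ExampleAddOneStage:
    # Static-class stage: ispec/ospec give the data format,
    # process() gives the (combinatorial) logic.
    @staticmethod
    def ispec():
        return FakeSignal()

    @staticmethod
    def ospec():
        return FakeSignal()

    @staticmethod
    def process(i):
        # Return a new result that may be assigned to the output via eq().
        return FakeSignal(i.value + 1)


inp = ExampleAddOneStage.ispec()
inp.eq(FakeSignal(41))
out = ExampleAddOneStage.ospec()
out.eq(ExampleAddOneStage.process(inp))
print(out.value)  # 42
```

Nothing here derives from Stage or StageCls: as noted above, derivation is
optional, and only the presence of the methods matters.
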
    StageChain:
    ----------

    A useful combinatorial wrapper around stages that chains them together
    and then presents a Stage-API-conformant interface.  By presenting
    the same API as the stages it wraps, it can clearly be used recursively.

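The recursive property can be sketched with plain values. This is a
simplified analogue only: ChainSketch, AddOne and Double are hypothetical
names, and the real StageChain performs the chaining inside setup() and
returns the stored result from process().

```python
class AddOne:
    @staticmethod
    def process(i):
        return i + 1


class Double:
    @staticmethod
    def process(i):
        return i * 2


class ChainSketch:
    # Hypothetical, simplified analogue of StageChain for plain values.
    def __init__(self, chain):
        self.chain = chain

    def process(self, i):
        # Output of each stage becomes the input of the next.
        for stage in self.chain:
            i = stage.process(i)
        return i


inner = ChainSketch([AddOne, Double])
outer = ChainSketch([inner, AddOne])  # a chain is itself usable as a stage
print(outer.process(3))  # (3 + 1) * 2 + 1 = 9
```
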
    RecordBasedStage:
    ----------------

    A convenience class that takes an input shape, output shape, a
    "processing" function and an optional "setup" function.  Honestly
    though, there's not much more effort to just... create a class
    that returns a couple of Records (see ExampleAddRecordStage in
    examples).

    PassThroughStage:
    ----------------

    A convenience class that takes a single function as a parameter,
    that is chain-called to create the exact same input and output spec.
    It has a process() function that simply returns its input.

    Instances of this class are completely redundant if handed to
    StageChain, however when passed to UnbufferedPipeline they
    can be used to introduce a single clock delay.

    ControlBase:
    -----------

    The base class for pipelines.  Contains previous and next ready/valid/data.
    Also has an extremely useful "connect" function that can be used to
    connect a chain of pipelines and present the exact same prev/next
    ready/valid/data API.

    UnbufferedPipeline:
    ------------------

    A simple stalling clock-synchronised pipeline that has no buffering
    (unlike BufferedHandshake).  Data flows on *every* clock cycle when
    the conditions are right (this is nominally when the input is valid
    and the output is ready).

    A stall anywhere along the line will result in a stall back-propagating
    down the entire chain.  The BufferedHandshake by contrast will buffer
    incoming data, allowing previous stages one clock cycle's grace before
    also having to stall.

    An advantage of the UnbufferedPipeline over the Buffered one is
    that the amount of logic needed (number of gates) is greatly
    reduced (no second set of buffers basically)

    The disadvantage of the UnbufferedPipeline is that the valid/ready
    logic, if chained together, is *combinatorial*, resulting in
    progressively larger gate delay.

    PassThroughHandshake:
    ------------------

    A Control class that introduces a single clock delay, passing its
    data through unaltered.  Unlike RegisterPipeline (which relies
    on UnbufferedPipeline and PassThroughStage) it handles ready/valid
    itself.

    RegisterPipeline:
    ----------------

    A convenience class that, because UnbufferedPipeline introduces a single
    clock delay, when its stage is a PassThroughStage, it results in a Pipeline
    stage that, duh, delays its (unmodified) input by one clock cycle.

    BufferedHandshake:
    ----------------

    nmigen implementation of buffered pipeline stage, based on zipcpu:
    https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

    this module requires quite a bit of thought to understand how it works
    (and why it is needed in the first place).  reading the above is
    *strongly* recommended.

    unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
    the STB / ACK signals to raise and lower (on separate clocks) before
    data may proceed (thus only allowing one piece of data to proceed
    on *ALTERNATE* cycles), the signalling here is a true pipeline
    where data will flow on *every* clock when the conditions are right.

    input acceptance conditions are when:
        * incoming previous-stage strobe (p.valid_i) is HIGH
        * outgoing previous-stage ready  (p.ready_o) is HIGH

    output transmission conditions are when:
        * outgoing next-stage strobe (n.valid_o) is HIGH
        * incoming next-stage ready  (n.ready_i) is HIGH

    the tricky bit is when the input has valid data and the output is not
    ready to accept it.  if it wasn't for the clock synchronisation, it
    would be possible to tell the input "hey don't send that data, we're
    not ready".  unfortunately, it's not possible to "change the past":
    the previous stage *has no choice* but to pass on its data.

    therefore, the incoming data *must* be accepted - and stored: that
    is the responsibility / contract that this stage *must* accept.
    on the same clock, it's possible to tell the input that it must
    not send any more data.  this is the "stall" condition.

    we now effectively have *two* possible pieces of data to "choose" from:
    the buffered data, and the incoming data.  the decision as to which
    to process and output is based on whether we are in "stall" or not.
    i.e. when the next stage becomes ready again, the output comes from
    the buffer if a stall had previously occurred, otherwise it comes
    direct from processing the input.

    this allows us to respect a synchronous "travelling STB" with what
    dan calls a "buffered handshake".

    it's quite a complex state machine!

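The buffering idea described above can be explored with a small cycle-level
software model. This is a generic sketch of a one-entry "skid buffer" written
for illustration, and it is an assumption-laden simplification rather than a
transcription of the BufferedHandshake logic in this module: SkidBufferModel
and its attribute names are hypothetical, and reset behaviour is not modelled.

```python
class SkidBufferModel:
    # Hypothetical software model: one output register plus one spare buffer.
    def __init__(self, process):
        self.process = process
        self.out_valid = False  # analogue of n.valid_o
        self.out_data = None    # analogue of n.data_o
        self.buf_valid = False  # True while stalled with buffered data
        self.buf_data = None    # analogue of r_data

    @property
    def ready_o(self):
        # Ready to accept input unless the spare buffer is occupied.
        return not self.buf_valid

    def clock(self, in_valid, in_data, out_ready):
        # Sample the transfers that happen on this clock edge.
        accept = in_valid and self.ready_o
        sent = self.out_data if (self.out_valid and out_ready) else None
        out_free = (not self.out_valid) or out_ready
        if out_free:
            if self.buf_valid:      # flush the buffer first
                self.out_data, self.out_valid = self.buf_data, True
                self.buf_valid = False
            elif accept:            # pass processed input straight through
                self.out_data, self.out_valid = self.process(in_data), True
            else:
                self.out_valid = False
        elif accept:                # output stalled: data must still be stored
            self.buf_data, self.buf_valid = self.process(in_data), True
        return accept, sent


model = SkidBufferModel(lambda x: x + 1)
pending = [1, 2, 3, 4, 5]
received = []
cycle = 0
while len(received) < 5 and cycle < 100:
    out_ready = (cycle % 3 != 0)  # consumer stalls on every third cycle
    in_valid = bool(pending)
    in_data = pending[0] if pending else None
    accepted, sent = model.clock(in_valid, in_data, out_ready)
    if accepted:
        pending.pop(0)
    if sent is not None:
        received.append(sent)
    cycle += 1
print(received)  # [2, 3, 4, 5, 6]
```

The point of the model is that data accepted on the cycle where the consumer
stalls is stored rather than lost, and is flushed first once the consumer is
ready again, so ordering is preserved.
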
    SimpleHandshake
    ---------------

    Synchronised pipeline, Based on:
    https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Cat, Const, Mux, Module, Value, Elaboratable
from nmigen.cli import verilog, rtlil
from nmigen.lib.fifo import SyncFIFO, SyncFIFOBuffered
from nmigen.hdl.ast import ArrayProxy
from nmigen.hdl.rec import Record, Layout

from abc import ABCMeta, abstractmethod
from collections.abc import Sequence, Iterable
from collections import OrderedDict
from queue import Queue
import inspect

from nmoperator import eq, cat, shape


class Object:
    def __init__(self):
        self.fields = OrderedDict()

    def __setattr__(self, k, v):
        print ("kv", k, v)
        # private/reserved names are real attributes; all others are fields
        if (k.startswith('_') or k in ["fields", "name", "src_loc"] or
           k in dir(Object) or "fields" not in self.__dict__):
            return object.__setattr__(self, k, v)
        self.fields[k] = v

    def __getattr__(self, k):
        # only called when normal lookup fails (object has no __getattr__)
        if k in self.__dict__:
            return object.__getattribute__(self, k)
        try:
            return self.fields[k]
        except KeyError as e:
            raise AttributeError(e)

    def __iter__(self):
        for x in self.fields.values():
            if isinstance(x, Iterable):
                yield from x
            else:
                yield x

    def eq(self, inp):
        res = []
        for (k, o) in self.fields.items():
            i = getattr(inp, k)
            print ("eq", o, i)
            rres = o.eq(i)
            if isinstance(rres, Sequence):
                res += rres
            else:
                res.append(rres)
        print (res)
        return res

    def ports(self):
        return list(self)


class RecordObject(Record):
    def __init__(self, layout=None, name=None):
        # pass the caller's name through rather than discarding it
        Record.__init__(self, layout=layout or [], name=name)

    def __setattr__(self, k, v):
        #print (dir(Record))
        if (k.startswith('_') or k in ["fields", "name", "src_loc"] or
           k in dir(Record) or "fields" not in self.__dict__):
            return object.__setattr__(self, k, v)
        self.fields[k] = v
        #print ("RecordObject setattr", k, v)
        if isinstance(v, Record):
            newlayout = {k: (k, v.layout)}
        elif isinstance(v, Value):
            newlayout = {k: (k, v.shape())}
        else:
            newlayout = {k: (k, shape(v))}
        self.layout.fields.update(newlayout)

    def __iter__(self):
        for x in self.fields.values():
            if isinstance(x, Iterable):
                yield from x
            else:
                yield x

    def ports(self):
        return list(self)


def _spec(fn, name=None):
    # call the spec function, passing "name" only if fn appears to accept it
    if name is None:
        return fn()
    varnames = dict(inspect.getmembers(fn.__code__))['co_varnames']
    if 'name' in varnames:
        return fn(name=name)
    return fn()


class PrevControl(Elaboratable):
    """ contains signals that come *from* the previous stage (both in and out)
        * valid_i: previous stage indicating all incoming data is valid.
                   may be a multi-bit signal, where all bits are required
                   to be asserted to indicate "valid".
        * ready_o: output to previous stage indicating readiness to accept data
        * data_i : an input - added by the user of this class
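
The multi-bit "valid" rule can be illustrated with plain integers. This is a
sketch only: all_bits_valid is a hypothetical helper, standing in for the
comparison against an all-ones constant of the same width as valid_i.

```python
def all_bits_valid(valid_i, width):
    # All-ones value of the given width; valid only if every bit is set.
    all1s = (1 << width) - 1
    return valid_i == all1s


print(all_bits_valid(0b111, 3))  # True
print(all_bits_valid(0b101, 3))  # False
```
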
    """

    def __init__(self, i_width=1, stage_ctl=False):
        self.stage_ctl = stage_ctl
        self.valid_i = Signal(i_width, name="p_valid_i") # prev   >>in  self
        self._ready_o = Signal(name="p_ready_o")         # prev   <<out self
        self.data_i = None # XXX MUST BE ADDED BY USER
        if stage_ctl:
            self.s_ready_o = Signal(name="p_s_o_rdy")    # prev   <<out self
        self.trigger = Signal(reset_less=True)

    @property
    def ready_o(self):
        """ public-facing API: indicates (externally) that stage is ready
        """
        if self.stage_ctl:
            return self.s_ready_o # set dynamically by stage
        return self._ready_o      # return this when not under dynamic control

    def _connect_in(self, prev, direct=False, fn=None):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        valid_i = prev.valid_i if direct else prev.valid_i_test
        data_i = fn(prev.data_i) if fn is not None else prev.data_i
        return [self.valid_i.eq(valid_i),
                prev.ready_o.eq(self.ready_o),
                eq(self.data_i, data_i),
               ]

    @property
    def valid_i_test(self):
        vlen = len(self.valid_i)
        if vlen > 1:
            # multi-bit case: valid only when valid_i is all 1s
            all1s = Const(-1, (len(self.valid_i), False))
            valid_i = (self.valid_i == all1s)
        else:
            # single-bit valid_i case
            valid_i = self.valid_i

        # when stage indicates not ready, incoming data
        # must "appear" to be not ready too
        if self.stage_ctl:
            valid_i = valid_i & self.s_ready_o

        return valid_i

    def elaborate(self, platform):
        m = Module()
        m.d.comb += self.trigger.eq(self.valid_i_test & self.ready_o)
        return m

    def eq(self, i):
        return [self.data_i.eq(i.data_i),
                self.ready_o.eq(i.ready_o),
                self.valid_i.eq(i.valid_i)]

    def __iter__(self):
        yield self.valid_i
        yield self.ready_o
        if hasattr(self.data_i, "ports"):
            yield from self.data_i.ports()
        elif isinstance(self.data_i, Sequence):
            yield from self.data_i
        else:
            yield self.data_i

    def ports(self):
        return list(self)


class NextControl(Elaboratable):
    """ contains the signals that go *to* the next stage (both in and out)
        * valid_o: output indicating to next stage that data is valid
        * ready_i: input from next stage indicating that it can accept data
        * data_o : an output - added by the user of this class
    """
    def __init__(self, stage_ctl=False):
        self.stage_ctl = stage_ctl
        self.valid_o = Signal(name="n_valid_o") # self out>>  next
        self.ready_i = Signal(name="n_ready_i") # self <<in   next
        self.data_o = None # XXX MUST BE ADDED BY USER
        #if self.stage_ctl:
        self.d_valid = Signal(reset=1) # INTERNAL (data valid)
        self.trigger = Signal(reset_less=True)

    @property
    def ready_i_test(self):
        if self.stage_ctl:
            return self.ready_i & self.d_valid
        return self.ready_i

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
            data/valid is passed *TO* nxt, and ready comes *IN* from nxt.
            use this when connecting stage-to-stage
        """
        return [nxt.valid_i.eq(self.valid_o),
                self.ready_i.eq(nxt.ready_o),
                eq(nxt.data_i, self.data_o),
               ]

    def _connect_out(self, nxt, direct=False, fn=None):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        ready_i = nxt.ready_i if direct else nxt.ready_i_test
        data_o = fn(nxt.data_o) if fn is not None else nxt.data_o
        return [nxt.valid_o.eq(self.valid_o),
                self.ready_i.eq(ready_i),
                eq(data_o, self.data_o),
               ]

    def elaborate(self, platform):
        m = Module()
        m.d.comb += self.trigger.eq(self.ready_i_test & self.valid_o)
        return m

    def __iter__(self):
        yield self.ready_i
        yield self.valid_o
        if hasattr(self.data_o, "ports"):
            yield from self.data_o.ports()
        elif isinstance(self.data_o, Sequence):
            yield from self.data_o
        else:
            yield self.data_o

    def ports(self):
        return list(self)


class StageCls(metaclass=ABCMeta):
    """ Class-based "Stage" API.  requires instantiation (after derivation)

        see "Stage API" above.  Note: python does *not* require derivation
        from this class.  All that is required is that the pipelines *have*
        the functions listed in this class.  Derivation from this class
        is therefore merely a "courtesy" to maintainers.
    """
    @abstractmethod
    def ispec(self): pass       # REQUIRED
    @abstractmethod
    def ospec(self): pass       # REQUIRED
    #@abstractmethod
    #def setup(self, m, i): pass # OPTIONAL
    @abstractmethod
    def process(self, i): pass  # REQUIRED


class Stage(metaclass=ABCMeta):
    """ Static "Stage" API.  does not require instantiation (after derivation)

        see "Stage API" above.  Note: python does *not* require derivation
        from this class.  All that is required is that the pipelines *have*
        the functions listed in this class.  Derivation from this class
        is therefore merely a "courtesy" to maintainers.
    """
    @staticmethod
    @abstractmethod
    def ispec(): pass

    @staticmethod
    @abstractmethod
    def ospec(): pass

    #@staticmethod
    #@abstractmethod
    #def setup(m, i): pass

    @staticmethod
    @abstractmethod
    def process(i): pass


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """
    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn
    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i): return self.__setup(m, i)


class StageChain(StageCls):
    """ pass in a list of stages, and they will automatically be
        chained together via their input and output specs into a
        combinatorial chain.

        the end result basically conforms to the exact same Stage API.

        * input to this class will be the input of the first stage
        * output of first stage goes into input of second
        * output of second goes into input into third (etc. etc.)
        * the output of this class will be the output of the last stage
    """
    def __init__(self, chain, specallocate=False):
        self.chain = chain
        self.specallocate = specallocate

    def ispec(self):
        return _spec(self.chain[0].ispec, "chainin")

    def ospec(self):
        return _spec(self.chain[-1].ospec, "chainout")

    def _specallocate_setup(self, m, i):
        for (idx, c) in enumerate(self.chain):
            if hasattr(c, "setup"):
                c.setup(m, i)               # stage may have some module stuff
            ofn = self.chain[idx].ospec     # last assignment survives
            o = _spec(ofn, 'chainin%d' % idx)
            m.d.comb += eq(o, c.process(i)) # process input into "o"
            if idx == len(self.chain)-1:
                break
            ifn = self.chain[idx+1].ispec   # new input on next loop
            i = _spec(ifn, 'chainin%d' % (idx+1))
            m.d.comb += eq(i, o)            # assign to next input
        return o                            # last loop is the output

    def _noallocate_setup(self, m, i):
        for (idx, c) in enumerate(self.chain):
            if hasattr(c, "setup"):
                c.setup(m, i)               # stage may have some module stuff
            i = o = c.process(i)            # store input into "o"
        return o                            # last loop is the output

    def setup(self, m, i):
        if self.specallocate:
            self.o = self._specallocate_setup(m, i)
        else:
            self.o = self._noallocate_setup(m, i)

    def process(self, i):
        return self.o # conform to Stage API: return last-loop output


class ControlBase(Elaboratable):
    """ Common functions for Pipeline API
    """
    def __init__(self, stage=None, in_multi=None, stage_ctl=False):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add data_i member to PrevControl (p) and
            * add data_o member to NextControl (n)
        """
        self.stage = stage

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl)
        self.n = NextControl(stage_ctl)

        # set up the input and output data
        if stage is not None:
            self.p.data_i = _spec(stage.ispec, "data_i") # input type
            self.n.data_o = _spec(stage.ospec, "data_o") # output type

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out
                       |                   ^
                       v                   |
                    [pipe1, pipe2, pipe3, pipe4]
                       |    ^  |    ^  |     ^
                       v    |  v    |  v     |
                     out---in out--in out---in

            Also takes care of allocating data_i/data_o, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.data_i or self.n.data_o manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects
            it is called with.

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) will have to be
            done manually.
        """
        eqs = [] # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]
            pipe2 = pipechain[i+1]
            eqs += pipe1.connect_to_next(pipe2)

        # connect front of chain to ourselves
        front = pipechain[0]
        self.p.data_i = _spec(front.stage.ispec, "chainin")
        eqs += front._connect_in(self)

        # connect end of chain to ourselves
        end = pipechain[-1]
        self.n.data_o = _spec(end.stage.ospec, "chainout")
        eqs += end._connect_out(self)

        return eqs

    def _postprocess(self, i): # XXX DISABLED
        return i # RETURNS INPUT
        if hasattr(self.stage, "postprocess"):
            return self.stage.postprocess(i)
        return i

    def set_input(self, i):
        """ helper function to set the input data
        """
        return eq(self.p.data_i, i)

    def __iter__(self):
        yield from self.p
        yield from self.n

    def ports(self):
        return list(self)

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        if self.stage is not None and hasattr(self.stage, "setup"):
            self.stage.setup(m, self.p.data_i)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.ready_i)
        m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted into a temporary register.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
                              |             |
                            process --->----^
                              |             |
                              +-- r_data ->-+

        input data p.data_i is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the non-stall condition is released, r_data is the first
        to be transferred to the output [flush_buffer()], and the stall
        condition cleared.

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     o_n_validn.eq(~self.n.valid_o),
                     n_ready_i.eq(self.n.ready_i_test),
                     nir_por.eq(n_ready_i & self.p._ready_o),
                     nir_por_n.eq(n_ready_i & ~self.p._ready_o),
                     nir_novn.eq(n_ready_i | o_n_validn),
                     nirn_novn.eq(~n_ready_i & o_n_validn),
                     npnn.eq(nir_por | nirn_novn),
                     por_pivn.eq(self.p._ready_o & ~p_valid_i)
        ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += eq(result, self.stage.process(self.p.data_i))

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.ready_o): # not stalled
            self.m.d.sync += eq(r_data, result) # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            data_o = self._postprocess(result)
            self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
                              eq(self.n.data_o, data_o),    # update output
                             ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n): # not stalled
            # Flush the [already processed] buffer to the output port.
            data_o = self._postprocess(r_data)
            self.m.d.sync += [self.n.valid_o.eq(1),      # reg empty
                              eq(self.n.data_o, data_o), # flush buffer
                             ]
        # output ready conditions
        self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
                              |             |
                              +--process->--^
        Truth Table

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P
        i o i o  PoR  NoV    o o
        V R R V              V R

        -------   -    -     - -
        0 0 0 0   0    0    >0 0    reg
        0 0 0 1   0    1    >1 0    reg
        0 0 1 0   0    0     0 1    process(data_i)
        0 0 1 1   0    0     0 1    process(data_i)
        -------   -    -     - -
        0 1 0 0   0    0    >0 0    reg
        0 1 0 1   0    1    >1 0    reg
        0 1 1 0   0    0     0 1    process(data_i)
        0 1 1 1   0    0     0 1    process(data_i)
        -------   -    -     - -
        1 0 0 0   0    0    >0 0    reg
        1 0 0 1   0    1    >1 0    reg
        1 0 1 0   0    0     0 1    process(data_i)
        1 0 1 1   0    0     0 1    process(data_i)
        -------   -    -     - -
        1 1 0 0   1    0     1 0    process(data_i)
        1 1 0 1   1    1     1 0    process(data_i)
        1 1 1 0   1    0     1 1    process(data_i)
        1 1 1 1   1    0     1 1    process(data_i)
        -------   -    -     - -
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal()
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
        p_valid_i_p_ready_o = Signal(reset_less=True)
        p_valid_i = Signal(reset_less=True)
        m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
                     n_ready_i.eq(self.n.ready_i_test),
                     p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
        ]

        # store result of processing in combinatorial temporary
        m.d.comb += eq(result, self.stage.process(self.p.data_i))

        # previous valid and ready
        with m.If(p_valid_i_p_ready_o):
            data_o = self._postprocess(result)
            m.d.sync += [r_busy.eq(1),              # output valid
                         eq(self.n.data_o, data_o), # update output
                        ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_ready_i):
            data_o = self._postprocess(result)
            m.d.sync += [eq(self.n.data_o, data_o)]
            # TODO: could still send data here (if there was any)
            #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0) # ...so set output invalid

        m.d.comb += self.n.valid_o.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._ready_o.eq(n_ready_i)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        chain stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.valid_i >>in   stage   n.valid_o out>>   stage+1
        stage-1   p.ready_o <<out  stage   n.ready_i <<in    stage+1
        stage-1   p.data_i  >>in   stage   n.data_o  out>>   stage+1
                              |             |
                            r_data        result
                              |             |
                              +--process ->-+

        Attributes:
        -----------
        p.data_i : StageInput, shaped according to ispec
            The pipeline input
        n.data_o : StageOutput, shaped according to ospec
            The pipeline output
        r_data : output_shape according to ospec
            A temporary (buffered) copy of the processed result of a
            prior (valid) input.  This is HELD if the output is not
            ready.  It is updated SYNCHRONOUSLY.
        result: output_shape according to ospec
            The output of the combinatorial logic.  it is updated
            COMBINATORIALLY (no clock dependence).

        Truth Table

        Inputs  Temp  Output  Data
        -------   -   -----   ----
        P P N N ~NiR&  N P
        i o i o  NoV   o o
        V R R V        V R

        -------   -    - -
        0 0 0 0   0    0 1    reg
        0 0 0 1   1    1 0    reg
        0 0 1 0   0    0 1    reg
        0 0 1 1   0    0 1    reg
        -------   -    - -
        0 1 0 0   0    0 1    reg
        0 1 0 1   1    1 0    reg
        0 1 1 0   0    0 1    reg
        0 1 1 1   0    0 1    reg
        -------   -    - -
        1 0 0 0   0    1 1    reg
        1 0 0 1   1    1 0    reg
        1 0 1 0   0    1 1    reg
        1 0 1 1   0    1 1    reg
        -------   -    - -
        1 1 0 0   0    1 1    process(data_i)
        1 1 0 1   1    1 0    process(data_i)
        1 1 1 0   0    1 1    process(data_i)
        1 1 1 1   0    1 1    process(data_i)
        -------   -    - -

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal() # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        # some temporaries
        p_valid_i = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_valid_i.eq(self.p.valid_i_test)
        m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
        m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)

        m.d.comb += self.n.valid_o.eq(data_valid)
        m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
        m.d.sync += data_valid.eq(p_valid_i | buf_full)

        with m.If(pv):
            m.d.sync += eq(r_data, self.stage.process(self.p.data_i))
        data_o = self._postprocess(r_data)
        m.d.comb += eq(self.n.data_o, data_o)

        return self.m

910 class UnbufferedPipeline2(ControlBase):
911 """ A simple pipeline stage with single-clock synchronisation
912 and two-way valid/ready synchronised signalling.
913
914 Note that a stall in one stage will result in the entire pipeline
915 chain stalling.
916
917 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
918 travel synchronously with the data: the valid/ready signalling
919 combines in a *combinatorial* fashion. Therefore, a long pipeline
920 chain will lengthen propagation delays.
921
922 Argument: stage. see Stage API, above
923
924 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
925 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
926 stage-1 p.data_i >>in stage n.data_o out>> stage+1
927 | | |
928 +- process-> buf <-+
929 Attributes:
930 -----------
931 p.data_i : StageInput, shaped according to ispec
932 The pipeline input
933 p.data_o : StageOutput, shaped according to ospec
934 The pipeline output
935 buf : output_shape according to ospec
936 A temporary (buffered) copy of a valid output
937 This is HELD if the output is not ready. It is updated
938 SYNCHRONOUSLY.
939
940 Inputs   Temp   Output  Data
941 -------  -----  ------  ----
942 P P N N  ~NiR&  N P     (buf_full)
943 i o i o   NoV   o o
944 V R R V         V R
945
946 -------    -    - -
947 0 0 0 0    0    0 1     process(data_i)
948 0 0 0 1    1    1 0     reg (odata, unchanged)
949 0 0 1 0    0    0 1     process(data_i)
950 0 0 1 1    0    0 1     process(data_i)
951 -------    -    - -
952 0 1 0 0    0    0 1     process(data_i)
953 0 1 0 1    1    1 0     reg (odata, unchanged)
954 0 1 1 0    0    0 1     process(data_i)
955 0 1 1 1    0    0 1     process(data_i)
956 -------    -    - -
957 1 0 0 0    0    1 1     process(data_i)
958 1 0 0 1    1    1 0     reg (odata, unchanged)
959 1 0 1 0    0    1 1     process(data_i)
960 1 0 1 1    0    1 1     process(data_i)
961 -------    -    - -
962 1 1 0 0    0    1 1     process(data_i)
963 1 1 0 1    1    1 0     reg (odata, unchanged)
964 1 1 1 0    0    1 1     process(data_i)
965 1 1 1 1    0    1 1     process(data_i)
966 -------    -    - -
967
968 Note: PoR is *NOT* involved in the above decision-making.
969 """
970
971 def elaborate(self, platform):
972 self.m = m = ControlBase.elaborate(self, platform)
973
974 buf_full = Signal() # is data valid or not
975 buf = _spec(self.stage.ospec, "r_tmp") # output type
976
977 # some temporaries
978 p_valid_i = Signal(reset_less=True)
979 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
980
981 m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
982 m.d.comb += self.p._ready_o.eq(~buf_full)
983 m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)
984
985 data_o = Mux(buf_full, buf, self.stage.process(self.p.data_i))
986 data_o = self._postprocess(data_o)
987 m.d.comb += eq(self.n.data_o, data_o)
988 m.d.sync += eq(buf, self.n.data_o)
989
990 return self.m
991
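# The valid/ready behaviour of UnbufferedPipeline2 above can be explored
# without nmigen using a small cycle-level software model. This is only a
# sketch: the class name UnbufferedPipeline2Model and its step() method are
# hypothetical and not part of this file. Each call to step() models one
# clock cycle: the combinatorial outputs are computed first, then the
# registers (buf_full, buf) are updated as m.d.sync would update them.

```python
class UnbufferedPipeline2Model:
    """Hypothetical plain-Python model of the UnbufferedPipeline2 logic."""

    def __init__(self, process):
        self.process = process   # stands in for stage.process
        self.buf_full = False    # register: is a held output waiting?
        self.buf = None          # register: copy of the last output data

    def step(self, p_valid_i, p_data_i, n_ready_i):
        # combinatorial part (m.d.comb in the real code)
        n_valid_o = self.buf_full or p_valid_i
        p_ready_o = not self.buf_full
        n_data_o = self.buf if self.buf_full else self.process(p_data_i)
        # synchronous part (m.d.sync in the real code)
        self.buf_full = (not n_ready_i) and n_valid_o
        self.buf = n_data_o
        return p_ready_o, n_valid_o, n_data_o


model = UnbufferedPipeline2Model(lambda x: x + 1)
cycle1 = model.step(True, 5, False)  # next stage stalls: result is held
cycle2 = model.step(True, 7, True)   # held result is delivered from buf
cycle3 = model.step(True, 7, True)   # buf is empty again: input passes
```

# In cycle2 p_ready_o is low, so the previous stage must keep presenting
# its data until the held value has been taken by the next stage.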
992
993 class PassThroughStage(StageCls):
994 """ a pass-through stage which has its input data spec equal to its output,
995 and "passes through" its data from input to output.
996 """
997 def __init__(self, iospecfn):
998 self.iospecfn = iospecfn
999 def ispec(self): return self.iospecfn()
1000 def ospec(self): return self.iospecfn()
1001 def process(self, i): return i
1002
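# PassThroughStage takes a spec *function* rather than a spec instance, so
# that ispec() and ospec() each return a fresh object of the same shape.
# The sketch below illustrates this with plain Python objects. The names
# PassThroughStageSketch and make_spec are hypothetical, and the dict
# stands in for what would be a Signal or Record in the real code.

```python
class PassThroughStageSketch:
    """Plain-Python stand-in for PassThroughStage (no nmigen needed)."""

    def __init__(self, iospecfn):
        self.iospecfn = iospecfn

    def ispec(self):
        return self.iospecfn()

    def ospec(self):
        return self.iospecfn()

    def process(self, i):
        return i


def make_spec():
    # hypothetical spec factory: real code would return a Signal or Record
    return {"data": 0}


stage = PassThroughStageSketch(make_spec)
i_spec = stage.ispec()
o_spec = stage.ospec()
same_shape = i_spec == o_spec            # equal in shape
separate_objects = i_spec is not o_spec  # but not the same object
```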
1003
1004 class PassThroughHandshake(ControlBase):
1005 """ A control block that delays by one clock cycle.
1006
1007 Inputs    Temporary              Output  Data
1008 -------   ---------------------  ------  ----
1009 P P N N   PiV&  PiV|  NiR|  pvr  N P     (pvr)
1010 i o i o   PoR   ~PoR  ~NoV       o o
1011 V R R V                          V R
1012
1013 -------   -     -     -     -    - -
1014 0 0 0 0   0     1     1     0    1 1     odata (unchanged)
1015 0 0 0 1   0     1     0     0    1 0     odata (unchanged)
1016 0 0 1 0   0     1     1     0    1 1     odata (unchanged)
1017 0 0 1 1   0     1     1     0    1 1     odata (unchanged)
1018 -------   -     -     -     -    - -
1019 0 1 0 0   0     0     1     0    0 1     odata (unchanged)
1020 0 1 0 1   0     0     0     0    0 0     odata (unchanged)
1021 0 1 1 0   0     0     1     0    0 1     odata (unchanged)
1022 0 1 1 1   0     0     1     0    0 1     odata (unchanged)
1023 -------   -     -     -     -    - -
1024 1 0 0 0   0     1     1     1    1 1     process(in)
1025 1 0 0 1   0     1     0     0    1 0     odata (unchanged)
1026 1 0 1 0   0     1     1     1    1 1     process(in)
1027 1 0 1 1   0     1     1     1    1 1     process(in)
1028 -------   -     -     -     -    - -
1029 1 1 0 0   1     1     1     1    1 1     process(in)
1030 1 1 0 1   1     1     0     0    1 0     odata (unchanged)
1031 1 1 1 0   1     1     1     1    1 1     process(in)
1032 1 1 1 1   1     1     1     1    1 1     process(in)
1033 -------   -     -     -     -    - -
1034
1035 """
1036
1037 def elaborate(self, platform):
1038 self.m = m = ControlBase.elaborate(self, platform)
1039
1040 r_data = _spec(self.stage.ospec, "r_tmp") # output type
1041
1042 # temporaries
1043 p_valid_i = Signal(reset_less=True)
1044 pvr = Signal(reset_less=True)
1045 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
1046 m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)
1047
1048 m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
1049 m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)
1050
1051 odata = Mux(pvr, self.stage.process(self.p.data_i), r_data)
1052 m.d.sync += eq(r_data, odata)
1053 r_data = self._postprocess(r_data)
1054 m.d.comb += eq(self.n.data_o, r_data)
1055
1056 return m
1057
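# The one-clock delay of the PassThroughHandshake above can be sketched as
# a cycle-level software model. This is only a sketch: the name
# PassThroughHandshakeModel and its step() method are hypothetical and not
# part of this file. n_valid_o and r_data are modelled as registers, so the
# values returned by step() are the ones visible *during* the cycle, before
# the clock edge updates them.

```python
class PassThroughHandshakeModel:
    """Hypothetical plain-Python model of the PassThroughHandshake logic."""

    def __init__(self, process):
        self.process = process   # stands in for stage.process
        self.n_valid_o = False   # register
        self.r_data = None       # register, drives n.data_o

    def step(self, p_valid_i, p_data_i, n_ready_i):
        # combinatorial part
        p_ready_o = (not self.n_valid_o) or n_ready_i
        pvr = p_valid_i and p_ready_o
        visible = (p_ready_o, self.n_valid_o, self.r_data)
        # synchronous part
        self.n_valid_o = p_valid_i or not p_ready_o
        if pvr:
            self.r_data = self.process(p_data_i)
        return visible


model = PassThroughHandshakeModel(lambda x: x * 2)
trace = [
    model.step(True, 3, True),    # input accepted, nothing on output yet
    model.step(False, 0, False),  # result appears, next stage stalls
    model.step(False, 0, True),   # result is taken by the next stage
    model.step(False, 0, True),   # output no longer valid
]
```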
1058
1059 class RegisterPipeline(UnbufferedPipeline):
1060 """ A pipeline stage that delays by one clock cycle, creating a
1061 sync'd latch out of data_o and valid_o as an indirect byproduct
1062 of using PassThroughStage
1063 """
1064 def __init__(self, iospecfn):
1065 UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))
1066
1067
1068 class FIFOControl(ControlBase):
1069 """ FIFO Control. Stores data in a SyncFIFO-style FIFO (Queue or SyncFIFOBuffered),
1070 which happens to have the same valid/ready signalling as the Stage API.
1071
1072 data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
1073 """
1074
1075 def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
1076 fwft=True, buffered=False, pipe=False):
1077 """ FIFO Control
1078
1079 * depth: number of entries in the FIFO
1080 * stage: data processing block
1081 * fwft : first word fall-thru mode (non-fwft introduces delay)
1082 * buffered: use buffered FIFO (introduces extra cycle delay)
1083
1084 NOTE 1: FPGAs may have trouble with the defaults for SyncFIFO
1085 (fwft=True, buffered=False)
1086
1087 NOTE 2: data_i *must* have a shape function. it can therefore
1088 be a Signal, or a Record, or a RecordObject.
1089
1090 data is processed (and located) as follows:
1091
1092 self.p  self.stage temp    fn   temp fn  temp  fn  self.n
1093 data_i->process()->result->cat->din.FIFO.dout->cat(data_o)
1094
1095 yes, really: cat produces a Cat() which can be assigned to.
1096 this is how the FIFO gets de-catted without needing a de-cat
1097 function
1098 """
1099
1100 assert not (fwft and buffered), "buffered cannot do fwft"
1101 if buffered:
1102 depth += 1
1103 self.fwft = fwft
1104 self.buffered = buffered
1105 self.pipe = pipe
1106 self.fdepth = depth
1107 ControlBase.__init__(self, stage, in_multi, stage_ctl)
1108
1109 def elaborate(self, platform):
1110 self.m = m = ControlBase.elaborate(self, platform)
1111
1112 # make a FIFO with a signal of equal width to the data_o.
1113 (fwidth, _) = shape(self.n.data_o)
1114 if self.buffered:
1115 fifo = SyncFIFOBuffered(fwidth, self.fdepth)
1116 else:
1117 fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
1118 m.submodules.fifo = fifo
1119
1120 # store result of processing in combinatorial temporary
1121 result = _spec(self.stage.ospec, "r_temp")
1122 m.d.comb += eq(result, self.stage.process(self.p.data_i))
1123
1124 # connect previous rdy/valid/data - do cat on data_i
1125 # NOTE: cannot do the PrevControl-looking trick because
1126 # of need to process the data. shaaaame....
1127 m.d.comb += [fifo.we.eq(self.p.valid_i_test),
1128 self.p.ready_o.eq(fifo.writable),
1129 eq(fifo.din, cat(result)),
1130 ]
1131
1132 # connect next rdy/valid/data - do cat on data_o
1133 connections = [self.n.valid_o.eq(fifo.readable),
1134 fifo.re.eq(self.n.ready_i_test),
1135 ]
1136 if self.fwft or self.buffered:
1137 m.d.comb += connections
1138 else:
1139 m.d.sync += connections # unbuffered non-fwft mode needs sync
1140 data_o = cat(self.n.data_o).eq(fifo.dout)
1141 data_o = self._postprocess(data_o)
1142 m.d.comb += data_o
1143
1144 return m
1145
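# The cat / de-cat idea used by FIFOControl (flatten the fields into one
# FIFO-wide word on the way in, split the word back into fields on the way
# out) can be sketched with plain integers. This is only an illustration:
# pack_fields and unpack_fields are hypothetical helpers, not functions of
# this file, and the deque merely stands in for the hardware FIFO. As with
# nmigen's Cat(), the first field is placed in the least significant bits.

```python
from collections import deque


def pack_fields(fields):
    """Pack (value, width) pairs into one integer, first field in the LSBs."""
    word, shift = 0, 0
    for value, width in fields:
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word, shift  # packed word and its total width in bits


def unpack_fields(word, widths):
    """Split a packed word back into per-field values, LSB field first."""
    values = []
    for width in widths:
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values


fifo = deque()                                    # stands in for the FIFO
word, total_width = pack_fields([(0x3, 4), (0xAB, 8)])
fifo.append(word)                                 # models a write to din
fields_out = unpack_fields(fifo.popleft(), [4, 8])  # models reading dout
```

# In the real code no unpack function is needed, because assigning
# fifo.dout to cat(self.n.data_o) performs the split.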
1146
1147 # aka "RegStage".
1148 class UnbufferedPipeline(FIFOControl):
1149 def __init__(self, stage, in_multi=None, stage_ctl=False):
1150 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
1151 fwft=True, pipe=False)
1152
1153 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
1154 class PassThroughHandshake(FIFOControl):
1155 def __init__(self, stage, in_multi=None, stage_ctl=False):
1156 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
1157 fwft=True, pipe=True)
1158
1159 # this is *probably* BufferedHandshake, although test #997 now succeeds.
1160 class BufferedHandshake(FIFOControl):
1161 def __init__(self, stage, in_multi=None, stage_ctl=False):
1162 FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
1163 fwft=True, pipe=False)
1164
1165
1166 """
1167 # this is *probably* SimpleHandshake (note: memory cell size=0)
1168 class SimpleHandshake(FIFOControl):
1169 def __init__(self, stage, in_multi=None, stage_ctl=False):
1170 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
1171 fwft=True, pipe=False)
1172 """