+
+Re-running the auto-generator tool, we note the following output has
+been created, and match it against the corresponding hand-generated (old)
+code:
+
+ `ifdef SDRAM
+ mkConnection (fabric.v_to_slaves
+ [fromInteger(valueOf(Sdram_slave_num))],
+ sdram.axi4_slave_sdram); //
+ mkConnection (fabric.v_to_slaves
+ [fromInteger(valueOf(Sdram_cfg_slave_num))],
+ sdram.axi4_slave_cntrl_reg); //
+ `endif
+
+ // fabric connections
+ mkConnection (fabric.v_to_slaves
+ [fromInteger(valueOf(SDR0_fastslave_num))],
+ sdr0.axi4_slave_sdram);
+ mkConnection (fabric.v_to_slaves
+ [fromInteger(valueOf(SDR0_fastslave_num))],
+ sdr0.axi4_slave_cntrl_reg);
+
+Immediately we can spot an issue: whilst the correctly-named slave(s) have
+been added, they have been added with the *same* fabric slave index. This
+is unsatisfactory and needs resolving.
+
+Here we need to explain a bit more about what is going on. The fabric
+on an AXI4 Bus is allocated numerical slave numbers, and each slave is
+also allocated a memory-mapped region that must be resolved in a bi-directional
+fashion. i.e whenever a particular memory region is accessed, the AXI
+slave peripheral responsible for dealing with it **must** be correctly
+identified. So this requires some further crucial information, which is
+the size of the region that is to be allocated to each slave device. Later
+this will be extended to being part of the specification, but for now
+it is auto-allocated based on the size. As a huge hack, it is allocated
+in 32-bit chunks, as follows:
+
+class sdram(PBase):
+
+ def num_axi_regs32(self):
+ return [0x400000, # defines an entire memory range (hack...)
+ 12] # defines the number of configuration regs
+
+So after running the autogenerator again, to confirm that this has
+generated the correct code, we examine several files, starting with
+fast\+memory\_map.bsv:
+
+ /*====== Fast peripherals Memory Map ======= */
+ `define SDR0_0_Base 'h50000000
+ `define SDR0_0_End 'h5FFFFFFF // 4194304 32-bit regs
+ `define SDR0_1_Base 'h60000000
+ `define SDR0_1_End 'h600002FF // 12 32-bit regs
+
+This looks slightly awkward (and in need of an external specification
+section for addresses) but is fine: the range is 1GB for the main
+map and covers 12 32-bit registers for the SDR Config map.
+Next we note the slave numbering:
+
+ typedef 0 SDR0_0__fastslave_num;
+ typedef 1 SDR0_1__fastslave_num;
+ typedef 2 FB0_fastslave_num;
+ typedef 3 LCD0_fastslave_num;
+ typedef 3 LastGen_fastslave_num;
+ typedef TAdd#(LastGen_fastslave_num,1) Sdram_slave_num;
+ typedef TAdd#(Sdram_slave_num ,`ifdef SDRAM 1 `else 0 `endif )
+ Sdram_cfg_slave_num;
+
+Again this looks reasonable and we may subsequently (carefully! noting
+the use of the TAdd# chain!) remove the #define for Sdram\_cfg\_slave\_num.
+The next phase is to examine the fn\_addr\_to\_fastslave\_num function,
+where we note that there were *two* hand-created sections previously,
+now joined by two *auto-generated* sections:
+
+ function Tuple2 #(Bool, Bit#(TLog#(Num_Fast_Slaves)))
+ fn_addr_to_fastslave_num (Bit#(`PADDR) addr);
+
+ if(addr>=`SDRAMMemBase && addr<=`SDRAMMemEnd)
+ return tuple2(True,fromInteger(valueOf(Sdram_slave_num))); <--
+ else if(addr>=`DebugBase && addr<=`DebugEnd)
+ return tuple2(True,fromInteger(valueOf(Debug_slave_num))); <--
+ `ifdef SDRAM
+ else if(addr>=`SDRAMCfgBase && addr<=`SDRAMCfgEnd )
+ return tuple2(True,fromInteger(valueOf(Sdram_cfg_slave_num)));
+ `endif
+
+ ...
+ ...
+ if(addr>=`SDR0_0_Base && addr<=`SDR0_0_End) <--
+ return tuple2(True,fromInteger(valueOf(SDR0_0__fastslave_num)));
+ else
+ if(addr>=`SDR0_1_Base && addr<=`SDR0_1_End) <--
+ return tuple2(True,fromInteger(valueOf(SDR0_1__fastslave_num)));
+ else
+ if(addr>=`FB0Base && addr<=`FB0End)
+
+Now, here is where, in a slightly unusual unique set of circumstances, we
+cannot just remove all instances of this address / typedef from the template
+code. Looking in the shakti-core repository's src/lib/MemoryMap.bsv file,
+the SDRAMMemBase macro is utilise in the is\_IO\_Addr function. So as a
+really bad hack, which will need to be properly resolved, whilst the
+hand-generated sections from fast\_tuple2\_template.bsv are removed,
+and the corresponding (now redundant) defines in src/core/core\_parameters.bsv
+are commented out, some temporary typedefs to deal with the name change are
+also added:
+
+ `define SDRAMMemBase SDR0_0_Base
+ `define SDRAMMemEnd SDR0_0_End
+
+This needs to be addressed (pun intended) including being able to specify
+the name(s) of the configuration parameters, as well as specifying which
+memory map range they must be added to.
+
+Now however finally, after carefully comparing the hard-coded fabric
+connections to what was formerly named sdram, we may remove the mkConnections
+that drop sdram.axi4\_slave\_sdram and its associated cntrl reg from
+the soc\_template.bsv file.
+
+## Connecting up the pins
+
+We are still not done! It is however worth pointing out that if this peripheral
+were not wired into the pinmux, we would in fact be finished. However there
+is a task that (previously having been left to outside tools) now needs to
+be specified, which is to connect the sdram's pins, declared in this
+instance in Ifc\_sdram\_out, and the PeripheralSideSDR instance that
+was kindly / strategically / thoughtfully / absolutely-necessarily exported
+from slow\_peripherals for exactly this purpose.
+
+Recall earlier that we took a cut/paste copy of the flexbus.py code. If
+we now examine socgen.bsv we find that it contains connections to pins
+that match the FlexBus specification, not SDRAM. So, returning to the
+declaration of the Ifc\_sdram\_out interface, we first identify the
+single-bit output-only pins, and add a mapping table between them:
+
+ class sdram(PBase):
+
+ def pinname_out(self, pname):
+ return {'sdrwen': 'ifc_sdram_out.osdr_we_n',
+ 'sdrcsn0': 'ifc_sdram_out.osdr_cs_n',
+ 'sdrcke': 'ifc_sdram_out.osdr_cke',
+ 'sdrrasn': 'ifc_sdram_out.osdr_ras_n',
+ 'sdrcasn': 'ifc_sdram_out.osdr_cas_n',
+ }.get(pname, '')
+
+Re-running the tool confirms that the relevant mkConnections are generated:
+
+ //sdr {'action': True, 'type': 'out', 'name': 'sdrcke'}
+ mkConnection(slow_peripherals.sdr0.sdrcke,
+ sdr0_sdrcke_sync.get);
+ mkConnection(sdr0_sdrcke_sync.put,
+ sdr0.ifc_sdram_out.osdr_cke);
+ //sdr {'action': True, 'type': 'out', 'name': 'sdrrasn'}
+ mkConnection(slow_peripherals.sdr0.sdrrasn,
+ sdr0_sdrrasn_sync.get);
+ mkConnection(sdr0_sdrrasn_sync.put,
+ sdr0.ifc_sdram_out.osdr_ras_n);
+
+Next, the multi-value entries are tackled (both in and out). At present
+the code is messy, as it does not automatically detect the multiple numerical
+declarations, nor that the entries are sometimes inout (in, out, outen),
+so it is *presently* done by hand:
+
+ class sdram(PBase):
+
+ def _mk_pincon(self, name, count, typ):
+ ret = [PBase._mk_pincon(self, name, count, typ)]
+ assert typ == 'fast' # TODO slow?
+ for pname, stype, ptype in [
+ ('dqm', 'osdr_dqm', 'out'),
+ ('ba', 'osdr_ba', 'out'),
+ ('ad', 'osdr_addr', 'out'),
+ ('d_out', 'osdr_dout', 'out'),
+ ('d_in', 'ipad_sdr_din', 'in'),
+ ('d_out_en', 'osdr_den_n', 'out'),
+ ]:
+ ret.append(self._mk_vpincon(name, count, typ, ptype, pname,
+ "ifc_sdram_out.{0}".format(stype)))
+
+This generates *one* mkConnection for each multi-entry pintype, and here we
+match up with the "InterfaceMultiBus" class from the specification side,
+where pin entries with numerically matching names were "grouped" into single
+multi-bit declarations.
+
+## Adjusting the BSV Interface to a get/put style
+
+For various reasons, related to BSV not permitting wires to be connected
+back-to-back inside the pinmux code, a get/put style of interface had to
+be done. This requirement has a knock-on effect up the chain into the
+actual peripheral code. So now the actual interface (Ifc\_sdram\_out)
+has to be converted. All straight methods (outputs) are converted to Get,
+and Action methods (inputs) converted to Put. Also, it is just plain
+sensible not to use Bool but to use Bit#, and for the pack / unpack to
+be carried out in the interface. After conversion, the code looks like this:
+
+ interface Ifc_sdram_out;
+ (*always_enabled, always_ready*)
+ interface Put#(Bit#(64)) ipad_sdr_din;
+ interface Get#(Bit#(64)) osdr_dout;
+ interface Get#(Bit#(64)) osdr_den_n;
+ interface Get#(Bit#(1)) osdr_cke;
+ interface Get#(Bit#(1)) osdr_cs_n;
+ interface Get#(Bit#(1)) osdr_ras_n;
+ interface Get#(Bit#(1)) osdr_cas_n;
+ interface Get#(Bit#(1)) osdr_we_n;
+ interface Get#(Bit#(8)) osdr_dqm;
+ interface Get#(Bit#(2)) osdr_ba;
+ interface Get#(Bit#(13)) osdr_addr;
+
+ method Bit#(9) sdram_sdio_ctrl;
+ interface Clock sdram_clk;
+ endinterface
+
+Note that osdr\_den\_n is now **64** bit **not** 8, as discussed above.
+After conversion, the code looks like this:
+
+ interface Ifc_sdram_out ifc_sdram_out;
+
+ interface ipad_sdr_din = interface Put
+ method Action put(Bit#(64) in)
+ sdr_cntrl.ipad_sdr_din <= in;
+ endmethod
+ endinterface;
+
+ interface osdr_dout = interface Get
+ method ActionValue#(Bit#(64)) get;
+ return sdr_cntrl.osdr_dout();
+ endmethod
+ endinterface;
+
+ interface osdr_den_n = interface Get
+ method ActionValue#(Bit#(64)) get;
+ Bit#(64) temp;
+ for (int i=0; i<8; i=i+1) begin
+ temp[i*8] = sdr_cntrl.osdr_den_n[i];
+ end
+ return temp;
+ endmethod
+ endinterface;
+
+ interface osdr_cke = interface Get
+ method ActionValue#(Bit#(1)) get;
+ return pack(sdr_cntrl.osdr_cke());
+ endmethod
+ endinterface;
+
+ ...
+ ...
+
+ endinterface;
+
+Note that the data input is quite straightforward, as is data out,
+and cke: whether 8-bit, 13-bit or 64-bit, the conversion process is
+mundane, with only Bool having to be converted to Bit#(1) with a call
+to pack. The data-enable however is a massive hack: whilst 64 enable
+lines come in, only every 8th bit is actually utilised and passed
+through. Whether this should be changed is a matter for debate that
+is outside of the scope of this document.
+
+Lastly for this phase we have two anomalous interfaces, exposing
+the control registers and the clock, that have been moved out of
+Ifc\_sdram\_out, to the level above (Ifc\_sdr\_slave) so that the sole
+set of interfaces exposed for inter-connection by the auto-generator is
+related exclusively to the actual pins. Resolution of these two issues
+is currently outside of the scope of this document.
+
+## Clock synchronisation
+
+Astute readers, if their heads have not exploded by this point, will
+have noticed earlier that the SDRAM instance was declared with an entirely
+different clock domain from slow peripherals. Whilst the pinmux is
+completely combinatorial logic that is entirely unclocked, BSV has no
+means of informing a *module* of this fact, and consequently a weird
+null-clock splicing trick is required.
+
+This code is again auto-generated. However, it is slightly tricky to
+describe and there are several edge-cases, including ones where the
+peripheral is a slow peripheral (UART is an example) that is driven
+from a UART clock but it is connected up inside the slow peripherals
+code; FlexBus is a Fast Bus peripheral that needs syncing up; RGB/TTL
+likewise, but JTAG is specifically declared inside the SoC and passed
+through, but its instantiation requires a separate incoming clock.
+
+Examining the similarity between the creation of an SDRAM instance
+and the JTAG instance, the jtag code therefore looks like it is the
+best candidate fit. However, it passes through a reset signal as well.
+Instead, we modify this to create clk0 but use the slow\_reset
+signal:
+
+ class sdram(PBase):
+
+ def get_clk_spc(self, typ):
+ return "tck, slow_reset"
+
+ def get_clock_reset(self, name, count):
+ return "slow_clock, slow_reset"
+
+The resultant synchronisation code, that accepts pairs of clock/reset
+tuples in order to take care of the prerequisite null reset and clock
+"synchronisation", looks like this:
+
+ Ifc_sync#(Bit#(1)) sdr0_sdrcke_sync <-mksyncconnection(
+ clk0, slow_reset, slow_clock, slow_reset);
+ Ifc_sync#(Bit#(1)) sdr0_sdrrasn_sync <-mksyncconnection(
+ clk0, slow_reset, slow_clock, slow_reset);
+ ...
+ ...
+ Ifc_sync#(Bit#(64)) sdr0_d_in_sync <-mksyncconnection(
+ slow_clock, slow_reset, clk0, slow_reset);
+
+Note that inputs are in reverse order from outputs. With the inclusion of
+clock synchronisation, automatic chains of mkConnections are set up between
+the actual peripheral and the peripheral side of the pinmux:
+
+ mkConnection(slow_peripherals.sdr0.sdrcke,
+ sdr0_sdrcke_sync.get);
+ mkConnection(sdr0_sdrcke_sync.put,
+ sdr0.ifc_sdram_out.osdr_cke);
+
+Interestingly, if *actual* clock synchronisation were ever to be needed,
+it could easily be taken care of with relatively little extra work. However,
+it is worth emphasising that the pinmux is *entirely* unclocked zero-reset
+combinatorial logic.
+
+# Conclusion
+
+This is not a small project, by any means.