From 76784f48e3a2cd1532fe860d289112425c79783b Mon Sep 17 00:00:00 2001 From: Andrey Miroshnikov Date: Mon, 21 Aug 2023 16:49:01 +0000 Subject: [PATCH] inorder_model: Documentation of the code so far. --- 3d_gpu/architecture/inorder_model.mdwn | 252 ++++++++++++++++++++++++- 1 file changed, 251 insertions(+), 1 deletion(-) diff --git a/3d_gpu/architecture/inorder_model.mdwn b/3d_gpu/architecture/inorder_model.mdwn index 112a823e3..e61c7a938 100644 --- a/3d_gpu/architecture/inorder_model.mdwn +++ b/3d_gpu/architecture/inorder_model.mdwn @@ -132,7 +132,257 @@ execute (not the case for mul/div etc., but will be dealt with later. **In-progress TODO** -# Code Explanation +# Code Explanation - *IN PROGRESS* + +*(Not all of the code has been explained, just the general classes.)* Source code: +## `Hazard` namedtuple data structure + +A `namedtuple` object stores the attributes of the register access. The +python `namedtuple` is immutable (like a normal tuple), while also allowing +elements to be accessed by predefined names. Immutability is useful because the +register access attributes must not change between the fetch and execute stages, +which is why a normal `list` or `dict` wouldn't be appropriate. + +Like a normal tuple, a `namedtuple` is also ordered (so the initially +defined order of its fields is preserved). See the +[python wiki on `namedtuple`](https://docs.python.org/3.7/library/collections.html#collections.namedtuple), +[online namedtuple tutorial](https://realpython.com/python-namedtuple/), +[sta]. + +`namedtuple` instances are hashable and can therefore be stored in sets, which +is exactly how they are used with the `RegisterWrite` class. One instruction +trace may contain zero or more `Hazard` register access objects (depending on +whether registers are needed for the instruction). + +## `HazardProfiles` + +A dictionary of currently supported register file types. 
Each entry (register +file type) defines the number of read and write ports, written as a tuple, with +the first entry being the number of read ports, and the second entry being the +number of write ports. + +Having multiple read and/or write ports means that multiple **different** +entries in the same register file can be read from and/or written to in the +same clock cycle. +Multiple ports do not prevent a stall when a register written by one +instruction is read by the following instruction, even if a spare port is +available (Read-after-Write hazard). + +## Parsing trace file dump using `read_file` function + +The `CPU` model class takes a single instruction trace `list` object as input. + +This trace `list` object is produced by the function +`read_file`, which itself reads an instruction trace file dumped by a modified +`ISACaller` ([link to code needed](LINK)). +From now on, the trace `list` object will simply be referred to as `trace`. + +Each line of the trace dump is of the form +`[{rw}:FILE:regnum:offset:width]* # insn` where: + +- `rw` is the access type: `r` for a register that is read (operands), or `w` +for a register that is written (to store the result, condition codes, etc.). +- `FILE` is the register file type (GPR/integer, FPR/floating-point, etc.; see +the Additional Information section at the end of this page). +*(TODO: use section reference link instead)*. +- `regnum` is the register number. +- `offset` *TODO: Perhaps the offset of data in bytes??? no idea (right now not +important, as examples all show 0 offset)* +- `width` is the length of the data in bits to be accessed from the register. +- `insn` is the full instruction written in PowerISA assembler. + +The block `[{rw}:FILE:regnum:offset:width]` is used zero or more times, +based on the total number of read and write registers used for the instruction. 
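As a rough illustration of the line format described above, the sketch below splits one trace line into the instruction text and its register access fields. The helper name `parse_trace_line` is hypothetical, and the `Hazard` field names are assumed to match those shown in the example trace on this page; this is not the actual `read_file` implementation.

```python
from collections import namedtuple

# Field names are an assumption, taken from the example trace on this page.
Hazard = namedtuple("Hazard", ["action", "target", "ident", "offs", "elwid"])


def parse_trace_line(line):
    """Split one trace line into [insn, Hazard, Hazard, ...] (hypothetical helper)."""
    # everything before '#' is register accesses, everything after is the insn
    accesses, _, insn = line.partition("#")
    trace = [insn.strip()]
    for block in accesses.split():
        # each block has the form rw:FILE:regnum:offset:width
        trace.append(Hazard._make(block.split(":")))
    return trace


example = parse_trace_line("r:GPR:0:0:64 w:GPR:1:0:64 # addi 1, 0, 0x0010")
no_regs = parse_trace_line("# nop")
```

All fields are kept as strings here, as in the example trace; a line with no register accesses simply produces a one-element list.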
+ +Example trace file with three instructions: + + r:GPR:0:0:64 w:GPR:1:0:64 # addi 1, 0, 0x0010 + r:GPR:0:0:64 w:GPR:2:0:64 # addi 2, 0, 0x1234 + r:GPR:1:0:64 r:GPR:2:0:64 # stw 2, 0(1) + +The instruction trace file is processed line by line, where each line is split +into the register access attributes (from which a new namedtuple is created using +`_make()` and the `Hazard` definition; see +[python wiki on _make() method](https://docs.python.org/3.7/library/collections.html#collections.somenamedtuple._make)). + +Each line is converted to a `trace` object of the form: +`[insn, Hazard(...), Hazard(...), ...]`. An example trace looks like this: + + ['addi 1, 0, 0x0010', + Hazard(action='r', target='GPR', ident='0', offs='0', elwid='64'), + Hazard(action='w', target='GPR', ident='1', offs='0', elwid='64')] + +The function `read_file` yields (see [python wiki on yield]()) a single `trace` +for each line of the trace file. To obtain all of the traces, the user calls +`read_file` with the filename of the `ISACaller` instruction trace dump and +assigns the result to a new variable. Because `read_file` yields, that result is +an iterable of `trace` objects, ready to be iterated over by the CPU model +(wrap it in `list()` if an actual list is needed). + +## RegisterWrite + +A class built around a Python set, used to keep track of the registers that +currently have a write pending (for detecting Read-after-Write Hazards). + +A Python set (see the +[python wiki on sets](https://docs.python.org/3.7/tutorial/datastructures.html#sets)) +is an unordered collection with **no duplicate elements**. + +By checking if the next instruction's read registers match any of the write +registers in the `RegisterWrite` set, the model can raise a STALL. + +Any instruction that reads a register in the set **MUST STALL** at the Decode +phase because the result of the currently issued/executed instruction has not +yet been written to the register(s) it needs. + +### Methods + + def __init__(self): + self.storage = set() + +Initialise the `RegisterWrite` set. 
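The Read-after-Write check described above comes down to a set intersection. The following is a minimal sketch using a plain Python `set` rather than the `RegisterWrite` class itself; the helper name `must_stall` and the choice to identify registers by `(file, number)` pairs are assumptions made for illustration only.

```python
def must_stall(pending_writes, read_regs):
    """Return True if any register about to be read still has a write pending."""
    return not pending_writes.isdisjoint(read_regs)


# Registers are identified here by (file, number) pairs, e.g. ("GPR", "1").
pending = {("GPR", "1")}  # "addi 1, 0, 0x0010" is still in flight

# "stw 2, 0(1)" reads GPR 1, which has not been written yet
stall_stw = must_stall(pending, {("GPR", "1"), ("GPR", "2")})

# an instruction reading only GPR 3 is unaffected
stall_other = must_stall(pending, {("GPR", "3")})

# once the write has been retired, the read can proceed
pending.difference_update({("GPR", "1")})
stall_after = must_stall(pending, {("GPR", "1")})
```

Using a set keeps the check independent of how many instructions are in flight: only the registers with outstanding writes matter.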
+ + def expect_write(self, regs): + return self.storage.update(regs) + +If there are new registers to be written to, add them to the current +`RegisterWrite` set. + + def write_expected(self, regs): + return (len(self.storage.intersection(regs)) != 0) + +Returns `True` if any of the given (read) registers is still waiting to be +written by a previous instruction, in which case the reading instruction must +stall. + + def retire_write(self, regs): + return self.storage.difference_update(regs) + +Remove the given registers from the `RegisterWrite` set once their writes have +been completed. + +## `get_input_regs` and `get_output_regs` functions + + + +## CPU class + +The `CPU` class models the in-order, single-issue core. It contains the +`RegisterWrite` set for tracking Read-after-Write Hazards, the fetch, decode, +issue, and execute stages, as well as a `stall` flag indicating whether the CPU +is currently stalled. + +The input to the model is a trace `list` object. + +The main method used while running the model is +`process_instructions()`, which is called every time an instruction trace +`list` object is read from a trace file. + +### Methods + + def __init__(self): + self.regs = RegisterWrite() + self.fetch = Fetch(self) + self.decode = Decode(self) + self.issue = Issue(self) + self.exe = Execute(self) + self.stall = False + + def reads_possible(self, regs): + # TODO: subdivide this down by GPR FPR CR-field. + # currently assumes total of 3 regs are readable at one time + possible = set() + r = regs.copy() + while len(possible) < 3 and len(r) > 0: + possible.add(r.pop()) + return possible + + def writes_possible(self, regs): + # TODO: subdivide this down by GPR FPR CR-field. 
+ # currently assumes total of 1 reg is possible regardless of what it is + possible = set() + r = regs.copy() + while len(possible) < 1 and len(r) > 0: + possible.add(r.pop()) + return possible + + def process_instructions(self): + stall = self.stall + stall = self.fetch.process_instructions(stall) + stall = self.decode.process_instructions(stall) + stall = self.issue.process_instructions(stall) + stall = self.exe.process_instructions(stall) + self.stall = stall + if not stall: + self.fetch.tick() + self.decode.tick() + self.issue.tick() + self.exe.tick() + +## Execute class + +The `Execute` class models the execute phase of the processor. +It contains a list of stages (`self.stages`), where the entry at index `n` is +the list of instructions due to complete `n` cycles from now. + +### Methods + + def __init__(self, cpu): + self.stages = [] + self.cpu = cpu + + def add_stage(self, cycles_away, stage): + # pad with empty stages so that index cycles_away exists + while cycles_away >= len(self.stages): + self.stages.append([]) + self.stages[cycles_away].append(stage) + + def add_instruction(self, insn, writeregs): + self.add_stage(2, {'insn': insn, 'writes': writeregs}) + + def tick(self): + self.stages.pop(0) # tick drops anything at time "zero" + + def process_instructions(self, stall): + instructions = self.stages[0] # get list of instructions + to_write = set() # need to know total writes + for instruction in instructions: + to_write.update(instruction['writes']) + # see if all writes can be done, otherwise stall + writes_possible = self.cpu.writes_possible(to_write) + if writes_possible != to_write: + stall = True + # retire the writes that are possible in this cycle (regfile writes) + self.cpu.regs.retire_write(writes_possible) + # and now go through the instructions, removing those regs written + for instruction in instructions: + instruction['writes'].difference_update(writes_possible) + return stall + +# Additional Information + +## On register file types + +Currently (20th Aug 2023), the following register files are included in the CPU +model: + +- General Purpose Registers (GPR) - stores integers (0-31 in default PowerISA, +0-127 for 
Libre-SOC with SVP64) +- Floating Point Registers (FPR) - stores floating-point numbers +- Condition Register (CR) - broken up into 4-bit fields +- Condition Register Fields (CRf) - stores the arithmetic condition of an +operation (less than, greater than, equal to zero, overflow) +- Fixed-Point Exception Register (XER) +- Machine State Register (MSR) +- Floating-Point Status and Control Register (FPSCR) +- Program Counter (PC); the PowerISA spec primarily calls this *Current +Instruction Address (CIA)*. See PowerISA v3.1, section 1.3.4 Description of +Instruction Operation +- Slow Special Purpose Registers (SPRs) +- Fast SPR (SPRf) + +*TODO: Special Purpose Registers and fields need a better explanation. The +initial writer of this page (Andrey) has very little understanding of whether +SPR is actually a register, or if it's just a category of registers (XER, etc.)* + +See the [PowerISA 3.1 spec](LINK) for detailed information on register files +(Book I, Chapters 1.3.4, 2.3, 3.2, 4.2, 5.2, 5.3). -- 2.30.2