lxo/ChangeLog

2020-12-20

        * 532: Implemented logic for mode-switching 32-bit insns with 6
        bits for the opcode, a 16-bit embedded compressed insn, and 10
        bits corresponding to subsequent insns, to tell whether or not
        each of them is compressed.  This nearly doubled the compression
        rate, using one such mode-switching insn per 3 compressed insns.
        (1:48)
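
        A minimal sketch of the 6+16+10-bit split described above.  The
        field order (opcode in the top bits, flags in the low bits), the
        flag bit order, and the function names are assumptions made for
        illustration; they are not taken from the estimator script or
        from the proposal.

```python
OPCODE_BITS, INSN16_BITS, FLAG_BITS = 6, 16, 10


def pack_mode_switch(opcode, insn16, compressed_flags):
    """Pack a hypothetical 32-bit mode-switching insn.

    compressed_flags holds up to 10 booleans, one per subsequent insn,
    True meaning "decode that insn as compressed".  Missing entries
    default to False.  The layout is an assumption, not the proposal's.
    """
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 6 bits")
    if not 0 <= insn16 < (1 << INSN16_BITS):
        raise ValueError("embedded insn does not fit in 16 bits")
    if len(compressed_flags) > FLAG_BITS:
        raise ValueError("at most 10 flags are available")
    flags = 0
    for i, is_compressed in enumerate(compressed_flags):
        flags |= int(bool(is_compressed)) << i
    return (opcode << (INSN16_BITS + FLAG_BITS)) | (insn16 << FLAG_BITS) | flags


def unpack_mode_switch(word):
    """Split a 32-bit word back into (opcode, insn16, list of 10 flags)."""
    flags = word & ((1 << FLAG_BITS) - 1)
    insn16 = (word >> FLAG_BITS) & ((1 << INSN16_BITS) - 1)
    opcode = (word >> (INSN16_BITS + FLAG_BITS)) & ((1 << OPCODE_BITS) - 1)
    return opcode, insn16, [bool((flags >> i) & 1) for i in range(FLAG_BITS)]
```

        The round trip pack_mode_switch -> unpack_mode_switch returns
        the original fields, with the flag list padded to 10 entries.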

2020-12-14

        * 532: Reported on compression ratio findings and analyses.
        (1:06)

2020-12-13

        * 532: Questioned some bullets under 16-imm opcodes.  Implemented
        condition register and system opcodes, 16-imm opcodes, extended
        load and store to cover 16-imm modes, condition bit expression
        parsing and finally bc 16-imm and bclr 10- and 16-bit opcodes.
        Tested a bit by visual inspection, introduced logic to backtrack
        into 32-bit and count such pairs as 10-bit nop + 16-imm insn,
        followed by 32-bit.  Fixed size estimation: count[2] was still
        counted as 16+16-imm, rather than a single 16-imm.  (5:30)

2020-12-06

        * 532: Adjusted the logic in comp16-v1-skel.py for 16-bit 16-imm
        rather than the 16+16 I'd invented.  Implemented the most relevant
        opcodes for 10-bit, and many of the 16-bit ones too.  Not yet
        implemented are conditional branches, Immediate, CR and System
        opcodes.  With all of nop, unconditional branch, ld/st,
        arithmetic, logical and floating-point, we get less than 3%
        compression in GCC, with not-entirely-unreasonable reg subsets.
        It's not looking good.  (8:27)

2020-12-02

        * Microwatts meeting.
        * 238: Added some thoughts on bl and blr, and implications about
        modes.  Also detailed my worries about how to preserve dynamic
        state, specifically switch-back-to-compressed-after-insn, across
        interrupts.  (1:44)

2020-11-30

        * 238: Settled the N-without-M issue: it was likely an error in
        the tables.  Raised an inconsistency in decoder pseudocode's
        reversal of M and N.  Returned to the uncertainty and need for
        specifying how to handle conflicts between
        standard-then-compressed followed by 10-bit with M=0.  Raised
        issue of missing documentation that branch targets are always
        uncompressed, not just 32-bit aligned.  Raised issue of the
        purpose of M and N bits, particularly in unconditional branches.
        Explained why I believe phase 1 decoder has to look at Cmaj.m bits
        to tell whether or not N is there, brought up the crnand and crand
        encodings as examples, and asked whether crand with M=0 should
        switch to 32-bit mode for only one insn, because the bit that
        usually holds N is 1, or permanently, because there's no N field
        in the applicable encoding.  (2:33)

        * 238: Detailed the motivations for my proposal of bit-shuffling
        in the 16-bit encoding, to reduce wires and selections in the
        realigning muxer.  Restated my question on N without M, since I
        can't relate the answer to the question; it appears to have been
        misunderstood.  Further expanded on the advantages of moving the
        Cmaj.m and M bits as suggested, even going as far as enabling an
        extended compressed opcode reusing the bit that signals a match
        for a 10-bit insn in uncompressed mode.  (3:29)

2020-11-29

        * 238: Noted some apparent contradictions in the rejection of
        extended 16-bit insns in the face of 16+16-bit insns.  Luke hit me
        with clarification that there's no such thing as a 16+16-bit insn
        in compressed mode, and I could see how I'd totally made it up by
        myself by reviewing the proposal.  Then asked other questions:
        what's the N for when there's no M, and what are the SV prefixes
        mentioned there, now that I no longer assume them to be something
        like extend-next.  Then I recorded some thoughts on minimizing the
        bits the muxer has to look into by mapping the bits that encode N,
        Cmaj.m and M onto the same bits that, in traditional mode, encode
        the primary opcode.  Then I was hit by the realization that,
        if we change the perspective from "uncompressed insns used to be
        32-bit only" to "uncompressed can be 32- or 16-bit depending on
        the opcode", on account of the 10-bit insns, uncompressed mode
        already needs to take the opcode into account to tell whether
        we're looking at a 16- or 32-bit insn, so why is it ok there, but
        not ok in compressed mode?
        Finally, I proposed an encoding scheme that encodes lengths of
        subsequent insns in an early insn, achieving more coverage for
        16-bit insns, better limit compression, far more flexible mode
        switching, enabling savings at far more sparse settings, and
        without eating up a pair of primary opcodes: the 32-bit
        mode-switching insn could even be an extended opcode, though it
        would probably not have as many pre-length encoding bits then.  It
        would fit an entire 16-bit insn, which could do useful work, or
        queue up further pre-length bits, that correspond to static
        upcoming insns and tell whether to decode them as 32-bit or as
        (pairs of?) 16-bit ones.  Compared max ratio, representation
        overhead, and break-even density.  Shared some more thoughts on
        48- and 64-bit insns.  (7:39)

        * 532: Got a little confused about some encodings; it's not clear
        whether the N and M bits in 16-bit instructions have uniform
        interpretation, or whether some proposed opcodes are repurposing
        them.  I'm surprised by such short immediate operands in the
        immediate instructions, if they don't get a 16-bit extension, or
        otherwise by the apparent requirement for an extended 16-bit
        immediate for something as simple as an mr encoded as addi.  Asked
        for clarification.  Not sure about how to proceed before I get it;
        the logic of the estimator would be too significantly impacted.
        (2:48)

2020-11-28

        * 532: Figured out and implemented the logic to infer mode
        switching for best compression under attempt 1 proposed encoding,
        namely with 10-bit insns, 16-bit insns, 16+16-bit insns, and
        32-bit insns.  10-bit insns appear in uncompressed mode, and can
        be followed by insns in either mode; 16-bit ones appear in
        compressed mode, and can remain in compressed mode, or switch to
        uncompressed mode for 1 insn or for good; 16+16-bit ones appear in
        compressed mode, and cannot switch modes; 32-bit ones appear only
        in uncompressed mode, or in the single-insn slot after a 16-bit
        that requests it.  If we find a 16-bit insn while we're in
        uncompressed mode, use a 10-bit nop to tentatively switch.  Insns
        that can be encoded in 10 bits, but appear in compressed mode, had
        better be encoded in 16 bits, for that offers further subsequent
        encoding options, without downsides for size estimation.  Insns
        that can be encoded as 16+16-bit decay to 32-bit if in
        uncompressed mode, or if, after a sequence thereof, a later insn
        forces a switch to 32-bit mode without an intervening switching
        insn.  Still missing: the code to select what insns can be encoded
        in what modes.  (6:42)
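
        The mode-switching rules above can be approximated by a small
        dynamic program over decoder states.  This is a simplified
        sketch, not the estimator actually written for 532: it leaves
        out 16+16-bit insns, assumes a 10-bit insn occupies a 16-bit
        slot, and the state names and the estimate_bits function are
        made up for illustration.

```python
INF = float("inf")


def estimate_bits(insns):
    """Estimate the smallest encoding size, in bits, of an insn stream.

    insns is a list of sets; each set names the encodings available for
    that insn, out of {10, 16, 32}.  States: "U" uncompressed, "C"
    compressed, "C1" uncompressed for one insn, then back to compressed.
    Assumption: a 10-bit insn (including a 10-bit nop) costs 16 bits.
    """
    best = {"U": 0, "C": INF, "C1": INF}
    for widths in insns:
        # A 10-bit nop in uncompressed mode may switch to compressed mode.
        best["C"] = min(best["C"], best["U"] + 16)
        nxt = {"U": INF, "C": INF, "C1": INF}
        if 32 in widths:
            # 32-bit insns stay in U, or fill the one-insn slot and return to C.
            nxt["U"] = min(nxt["U"], best["U"] + 32)
            nxt["C"] = min(nxt["C"], best["C1"] + 32)
        if 10 in widths:
            # 10-bit insns live in U and may be followed by either mode.
            for state in ("U", "C"):
                nxt[state] = min(nxt[state], best["U"] + 16)
        if 16 in widths:
            # 16-bit insns live in C and may stay, or switch for one insn or for good.
            for state in ("U", "C", "C1"):
                nxt[state] = min(nxt[state], best["C"] + 16)
        best = nxt
    return min(best.values())
```

        Comparing estimate_bits(insns) against 32 * len(insns) gives a
        rough compression figure for a stream under these assumptions.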

        * 532: Implemented a skeleton for compression ratio estimation,
        initially with the simpler mode switching of the 8-bit nop,
        odd-address 16-bit insns.  Next, rewrite it for all the complexity
        of mode switching envisioned for the "attempt 1" proposal.  (2:02)

2020-11-23

        * 238: Debated various possibilities of 16-bit encoding.  (5:20)

        * 532: Wrote a histogram Python script that breaks counts down
        per opcode, and within each opcode, by operands.  (2:05)
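
        A minimal sketch of that two-level breakdown.  The input format,
        (mnemonic, operands) pairs, and the insn_histogram name are
        assumptions for illustration; the actual script and the
        binutils patch output it consumes are not reproduced here.

```python
from collections import Counter, defaultdict


def insn_histogram(insns):
    """Count insns per opcode and, within each opcode, per operand string.

    insns is an iterable of (mnemonic, operands) pairs, e.g.
    ("addi", "r3,r3,1").  This input shape is an assumption.
    """
    per_opcode = Counter()
    per_operands = defaultdict(Counter)
    for mnemonic, operands in insns:
        per_opcode[mnemonic] += 1
        per_operands[mnemonic][operands] += 1
    return per_opcode, per_operands
```

        per_opcode.most_common() then lists opcodes by frequency, and
        per_operands[op].most_common() does the same for the operand
        combinations of a single opcode.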

2020-11-22

        * 529: Brought up the possibilities of using 8-bit nops to switch
        between modes, so that 16-bit insns would be at odd addresses, so
        that we could use the full 16 bits; of using 2-operand insns
        instead of 3- for 16-bit mode so as to increase the coverage of
        the compact encoding.
        * 238: Luke moved the comment above here, where it belonged.
        * 529: Elaborated how using actual odd-addresses for 16-bit insns
        would be dealt with WRT endianness.  Prompted by Luke, added it to
        the wiki.
        * Wiki: Added self to team.  (11:50)

2020-11-21

        * 532: Wrote patch for binutils to print insn histogram.
        * Mission: Restated the proposal of adding "and users" to the
        mission statement, next to customers, as those we wish to enable
        to trust our products.  (6:48)

2020-11-20

        Reposted join message to the correct list.
        * 238: Started looking into it, from
        https://libre-soc.org/openpower/sv/16_bit_compressed/

2020-11-19

        Joined.
 175
 176         Joined.