add notes on 2024-01-23 meeting. terminated due to harrassment
[libreriscv.git] / openpower / prefix_codes.mdwn
1 # Prefix-code encode/decode acceleration
2
3 This is useful for Huffman codes, and other prefix-codes, which are used a lot in common formats:
4
5 * DEFLATE (zip, png, gzip, etc.)
6 * JPEG
7 * MP3
8 * MPEG-2/H.262 (DVD video)
9 * zstd
10 * Brotli
11 * etc.
12
13 Links:
14
15 * <https://gist.github.com/kirlf/2eb242f225f9bfed4ecbfc8e1e2f5f71>
16 * <https://en.m.wikipedia.org/wiki/Huffman_coding>
17 * <https://bugs.libre-soc.org/show_bug.cgi?id=933>
18
19 # Prefix-code decode description
20
21 `pcdec. RT,RA,RB,RC,imm`
22
23 |0 |6 |11 |16 |21 |26 |31 |
24 | PO | RT | RA | RB | RC | XO | imm |
25
26 if `imm` is 1, at most one code-word is decoded -- useful for things like DEFLATE that alternate code words with other bits, or use multiple binary code trees. if `imm` is 0, it decodes multiple code-words
27
28 The binary code tree is encoded in `RB` like so:
29
30 ```
31 t[i] = (RB >> i) & 0x1
32
33 |
34 +-----------+-----------+
35 | |
36 0 1
37 | |
38 +---t[2]----+ +---t[3]----+
39 | | | |
40 0 1 0 1
41 | | | |
42 +t[4]-+ +t[5]-+ +t[6]-+ +t[7]-+
43 | | | | | | | |
44 0 1 0 1 0 1 0 1
45 | | | | | | | |
46 t[8] t[9] t[10] t[11] t[12] t[13] t[14] t[15]
47
48 and so on for t[16..]
49 ```
50
51 Decoding a code word works by walking on the tree from the root to the children, matching each passed 0 or 1 to the next read input bit in RA in LSB to MSB order. When `t[i]` is set, then a valid code word was read and `i` is written to the next byte of output in RT in LSB to MSB order. When no set `t[i]` is encountered, and there are still input bits left, then the code word is >5-bits, so SO/OV/OV32 are set, and decoding stops.
52
53 [[!inline pages="openpower/isa/prefix_codes" quick="yes" raw="yes" ]]
54
55 # [DRAFT] Prefix-code encode
56
57 TODO