- # clear the shadow register when it does not match, and OR every selected shadow register
- # part to form the output. This can save a significant amount of logic; the size of
- # a complete k-OR or k-MUX gate tree for n inputs is `s = ceil((n - 1) / (k - 1))`,
- # and its logic depth is `ceil(log_k(s))`, but a 4-LUT can implement either a 4-OR or
- # a 2-MUX gate.
+ # AND the shadow register chunk with the comparator output, and OR all of those together.
+ # If the toolchain doesn't already synthesize multiplexer trees this way, this trick can
+ # save a significant amount of logic, since e.g. one 4-LUT can pack only one 2-MUX, but
+ # it can pack two 2-AND or 2-OR gates.