If `sv.lfdup` were not available, `sv.lfdu` could be used to the same
effect. Because `sv.lfdu` updates RA before the load rather than after it,
RA would have to be *pre-subtracted by one element* outside of the loop.
Due to the compactness of this highly hardware-parallelizable