add tree_reduction and pop_count, based on dead-code elimination of prefix_sum_ops