Your sample code is a progressive sum rather than a reduction, although of course the final element of a progressive sum is the overall reduction value. Alternate is for reductions, and as a result is independent of the value of d. It’s not clear to me how to have a d-independent stage for a progressive sum, or even if it is possible. You got an algorithm, or a proof that it is impossible?
Currently the shuffle op can provide d-dependent staging as used by your code snippet; the drawback to shuffle is that it is expensive in entropy, hence the introduction of alternate as a special case. Clearly we could add a set of d-dependent ops, covering all possible vector widths on a given member, to replace the shuffles. It’s not clear that progressive sum (or progressive anything in general) is common enough to be worth the clutter in the instruction set. If you can come up with a d-independent stage then I’d be much more inclined to put it in the Mill op set.