Pipelining is easier IMO, at least with SSA. You have a feasible linear schedule for one iteration and simply wrap it around the torus, so everything is still done in scalar order. While an instruction may contain operations from different iterations, the arguments of each operation are unambiguously the ones that belong to its own iteration, so aliasing is no more of an issue than it is with simple unrolling. The annoying part is the loop prologue and epilogue, and there we have hardware help.
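To make the "wrap it around the torus" step concrete, here is a rough Python sketch of the idea, not anything taken from a real toolchain. All names (`wrap_schedule`, `run_pipelined`, `demo`, the four toy operations and the initiation interval `ii`) are made up for illustration. The bounds check on the iteration number is only a software stand-in for prologue/epilogue handling; it does not model how the hardware actually does it.

```python
DATA = [5, 7, 9, 11]  # hypothetical input array

def wrap_schedule(linear, ii):
    """Fold a one-iteration schedule [(cycle, op), ...] into ii kernel slots."""
    kernel = [[] for _ in range(ii)]
    for cycle, op in linear:
        # slot = position on the torus, stage = how many laps behind this op runs
        kernel[cycle % ii].append((cycle // ii, op))
    return kernel

def run_pipelined(kernel, ops, n_iters):
    """Execute the kernel; each pass issues ops from several iterations."""
    n_stages = 1 + max(stage for slot in kernel for stage, _ in slot)
    # One private namespace per iteration stands in for SSA values:
    # an op can only ever see arguments from its own iteration.
    env = [dict() for _ in range(n_iters)]
    trace = []
    for k in range(n_iters + n_stages - 1):
        for slot in kernel:
            issued = []
            for stage, op in slot:
                i = k - stage  # iteration this op belongs to
                if 0 <= i < n_iters:  # stand-in for prologue/epilogue handling
                    ops[op](env[i], i)
                    issued.append((op, i))
            trace.append(issued)
    return trace

def demo(ii):
    out = [None] * len(DATA)

    def load(e, i): e["x"] = DATA[i]
    def mul(e, i): e["y"] = e["x"] * 3
    def add(e, i): e["z"] = e["y"] + 1
    def store(e, i): out[i] = e["z"]

    ops = {"load": load, "mul": mul, "add": add, "store": store}
    # Feasible linear schedule for one iteration: one op per cycle.
    linear = [(0, "load"), (1, "mul"), (2, "add"), (3, "store")]
    trace = run_pipelined(wrap_schedule(linear, ii), ops, len(DATA))
    return out, trace

if __name__ == "__main__":
    out, trace = demo(2)
    print(out)
    for instruction in trace:
        print(instruction)
```

With `ii = 2` the steady-state instructions mix operations from neighbouring iterations (for example a `load` for iteration 1 next to an `add` for iteration 0), yet the result matches the plain scalar loop because every operation reads only its own iteration's values.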
Reply To: code examples? Ivan Godard