OK. I’m still curious whether deferred loads could help with pipelined loops in the way I suggested.
We might have a more complex loop where the compiler can’t tell whether memory locations will alias. For example:
    for (int i = 0; i < N; i++)
        A[i] = A[f(i)] + 1;
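
To make the aliasing concern concrete, here is a rough sketch in C. Everything in it beyond the loop itself is my own assumption for illustration: the two index functions `f_backward` and `f_forward`, the fixed size `N = 8`, and the helper names are hypothetical. `run_loads_hoisted` is a deliberately extreme software model of what aggressive pipelining might do (issue every load before any store); it is not a model of how deferred loads would actually behave in hardware.

```c
#include <stdbool.h>
#include <string.h>

enum { N = 8 };  /* assumed array size, chosen only for the example */

typedef int (*index_fn)(int);

/* Hypothetical f: reads an element written by an EARLIER iteration. */
static int f_backward(int i) { return i > 0 ? i - 1 : 0; }

/* Hypothetical f: reads an element that no earlier iteration has written. */
static int f_forward(int i) { return i + 1 < N ? i + 1 : i; }

/* The loop as written: each load happens after all earlier stores. */
static void run_in_order(int A[N], index_fn f) {
    for (int i = 0; i < N; i++)
        A[i] = A[f(i)] + 1;
}

/* Extreme reordering: every load is issued before any store. */
static void run_loads_hoisted(int A[N], index_fn f) {
    int loaded[N];
    for (int i = 0; i < N; i++)
        loaded[i] = A[f(i)];
    for (int i = 0; i < N; i++)
        A[i] = loaded[i] + 1;
}

/* True if hoisting the loads gives the same result as the original loop. */
static bool hoisting_is_safe(index_fn f) {
    int a[N], b[N];
    for (int i = 0; i < N; i++)
        a[i] = b[i] = 10 * i;
    run_in_order(a, f);
    run_loads_hoisted(b, f);
    return memcmp(a, b, sizeof a) == 0;
}
```

With `f_forward` the two versions agree, because no load ever reads a location that an earlier iteration stored to. With `f_backward` they differ, because each load depends on the previous iteration's store. The compiler can't tell which case it has if `f` is opaque, so it has to assume the second.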