Mill Computing, Inc

Participant

January 8, 2023 at 10:15 am

Many, perhaps nearly all PGO optimizations are counter-productive on a Mill. Thus for example unrolling loops prevents software pipelining and forces spill/fill of transients that otherwise would live only on the belt for their lifetime; shut unrolling off for better performance.

Maybe this is a vocabulary problem, but there’s definitely a kind of unrolling that the Mill benefits from. We could call it “horizontal” unrolling, as opposed to the traditional “vertical” unrolling.

Say you have code that looks like this:


LOOP:
    CON(...),
    ADD(...), LOAD(...),
    STORE(...);

If your Mill is wide enough, you will definitely benefit from turning it into:

LOOP:
    CON(...), CON(...), CON(...),
    ADD(...), LOAD(...), ADD(...), LOAD(...), ADD(...), LOAD(...),
    STORE(...), STORE(...), STORE(...);

I don’t know exactly what you’d call that operation, but to me it’s “unrolling” of a sort. And it definitely involves the “efficiency / code size” tradeoff that PGO helps optimize.
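To make "if your Mill is wide enough" a bit more concrete, here is a rough Python sketch of how one could estimate the horizontal unroll factor. The unit names and slot counts are made up for illustration and are not real Mill family member specs. It also ignores dependencies between iterations and belt pressure, so treat it as an upper bound at best.

```python
def max_horizontal_unroll(body_ops, slots):
    """Largest number of loop-body copies that fit in one wide instruction.

    body_ops: ops of each kind used by ONE copy of the loop body.
    slots:    issue slots of each kind in one instruction (hypothetical numbers).
    """
    # Each kind of unit limits us to (available slots // slots needed per copy).
    factors = [slots.get(unit, 0) // needed
               for unit, needed in body_ops.items() if needed > 0]
    # The scarcest unit decides; an empty body gives nothing to unroll.
    return min(factors) if factors else 0


# One copy of the loop body from the example: CON, ADD, LOAD, STORE.
body = {"con": 1, "alu": 1, "load": 1, "store": 1}
# Assumed slot counts for an imaginary wide Mill member.
wide_mill = {"con": 3, "alu": 4, "load": 3, "store": 3}

print(max_horizontal_unroll(body, wide_mill))  # 3
```

With these assumed numbers the body fits three times per instruction, which matches the widened loop above. Whether it is worth doing for a given loop is the kind of question profile data could help answer.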

Many, perhaps nearly all PGO optimizations are counter-productive on a Mill. […] It is also unclear how much function and block reordering will actually pay;

Either way, as long as your codegen is based on heuristics, you’ll benefit from PGO somewhere.

To give a Mill-specific example, there are load delays. Your current heuristic is “as long as you can get away with”, but a PGO analysis might say “actually, this load almost always hits L1, so we can give it a delay of 3, which lets the following load start earlier or lets the instructions pack better”.
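As a sketch of what that PGO decision could look like, assume the profiler hands us a histogram of observed latencies (in cycles) for one load site. The function name, the 95% coverage threshold, the `max_delay` fallback, and the sample numbers are all my own assumptions, not anything from the actual Mill toolchain.

```python
def choose_load_delay(latency_counts, coverage=0.95, max_delay=32):
    """Pick the smallest delay that covers `coverage` of the profiled loads.

    latency_counts: {observed latency in cycles: number of samples}.
    max_delay:      hypothetical stand-in for "as long as you can get away with".
    """
    total = sum(latency_counts.values())
    if total == 0:
        # No profile data for this load: fall back to the existing heuristic.
        return max_delay
    seen = 0
    for latency in sorted(latency_counts):
        seen += latency_counts[latency]
        if seen / total >= coverage:
            # Enough samples finish within this latency; never exceed the cap.
            return min(latency, max_delay)
    return max_delay


# Made-up profile: the load nearly always hits L1 (3 cycles).
profile = {3: 970, 12: 25, 80: 5}
print(choose_load_delay(profile))  # 3
```

A load that misses half the time would fall back to the long delay, so the profile only changes the schedule where it has evidence that the short delay is safe most of the time.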

This reply was modified 1 year, 6 months ago by NarrateurDuChaos.

Reply To: Grab bag of questions