Mill Computing, Inc. Forums The Mill Architecture Introduction to the Mill CPU Programming Model Reply To: Introduction to the Mill CPU Programming Model

Ivan Godard
Keymaster
Post count: 689

Control-flow join points must normalize the belt so that each live operand is in the same place regardless of how the join was reached. The best normalization code depends on the number of live values and their respective positions.

The completely general solution is to choose the belt layout of one inbound path as canonical, and add a conform operation to each other path to force the belt on that path to match (conform with) the canonical. If you have profile info then the canonical path should be the one most commonly taken.

If the values to be normalized are in the same relative order on all paths (although not the same actual positions) then putting a rescue operation on each path gives a cheaper encoding than conform.

If the code between the dominator and the join on each path is speculable (as in your example) then if-conversion will give by far the best code, replacing the branch with one or more pick operations. In modern fabrication processes it cost little more to do a wasted arithmetic operation than to leave the functional unit idle for that cycle. However, if there is more code than available FU cycles this speculation strategy will increase latency, so using pick is most advantageous on wide family members. However, overall program performance may still be better with pick if the branch does not have consistently predictable behavior; the speculation latency of the pick will be exceeded by the mispredict penalties of the branch.

Lastly, in some cases (including your example) it is possible to do a poor-man’s conform using a null arithmetic operation, typically ORing with an immediate zero. Then if the code takes the true path there will be a result dropped by the add, and if it takes the false path then a copy of the value will be dropped by the OR, and in either case the correct value will be at the front of the belt for further processing.

The specializer algorithm currently if-converts everything if the operations are speculable, and it inserts conform ops if not. A later peep-hole step then changes the conforms to rescues if that is possible.