Mill Computing, Inc

Keymaster

October 16, 2014 at 6:08 pm

Post count: 689

Exactly right.

When the prediction facility was being designed, we decided that any unstable transfer point such as method calls (what Cliff Click calls “megamorphic”) would be of interest only if control passed through it frequently enough to keep the various target EBBs in the L2. Consequently we did not concern ourselves with issues of DRAM latency, and were primarily concerned with the miss-predict penalty (yes, the Mill has a penalty that is a third that of other architectures, but a stall is still a stall) and the L1/L2 latencies.

Next, we observed that the codes we looked at, and in our personal experience, megamorphic transfers tended to not be calls on leaf function. Instead, a megamorphic transfer usually led to a sequence of transfers that were not megamorphic, until control returned back through the megamorphic point. This reflects a program model in which a test point figures out the type of a datum, and then does stuff with the known type. Mill run-ahead prediction works well with this model, because once through the megamorph the subsequent control flow will predict well and will be prefetched ahead of need. Incidentally, the original source may not have been written to this model, but the compiler’s value-set analysis and inlining will tend to remove the type-checks internally to the known-type cluster, and even if it does not, the type will in fact be fixed and the transfers not be megamorphic.

The alternative model works less well, where the same object is subjected to a succession of leaf-function calls. Each of the calls will be megamorphic if the object is (and so will miss-predict frequently), and the leaf functions don’t have long enough transfer chains for run-ahead to win much.

This suggests an obvious optimization for the compiler: transform the latter structure into the former. This dimensionality inversion or transpose is the control space analogy to the re-tiling optimizations that are used to improve array traversal in the data space, and for all I know some compilers already do it. It should benefit any prediction mechanism, not just ours.

Reply To: Prediction