Could someone please draw a diagram showing, cycle by cycle, the state of the Mill pipeline when a typical call happens and returns? That isn’t very clear to me. When does the belt renaming happen, and do different phases see different belt numbers?
From my understanding, the existence of the reader phase is by itself bad for performance, unlike the subsequent ones. It is certainly an ingenious way to enable denser instruction encoding, both by allowing a more complex encoding and by reducing the number of instructions encoded, which avoids the fixed overhead of an extra instruction. But except perhaps for the interaction with function calls, I don’t see where else there can be a gain.
The abstract Mill instruction encoding format leaves a lot of flexibility for the underlying hardware and physical instruction set to change while remaining compatible with the abstract Mill. If the implementation trade-offs change for a certain Mill member, and it makes sense to encode reader-phase and compute-phase operations together so that they are decoded and executed in the same cycle, would this still be a Mill? Or is the reader phase executing a cycle earlier a hardwired assumption that would break compatibility if changed?