Hardware divide units are expensive and don’t pipeline well. On a microcoded OOO box that doesn’t matter much, but the Mill doesn’t microcode and is in-order. Other machines also do division in macrocode starting from a seed value, so we knew it was possible and would fit with the rest of the machine regardless of the algorithm. That let us leave the details to arm-waving until later. The emulation codes are like that in general: obviously software emulation of e.g. floating-point is possible, but it’s a rat’s nest that needs a specialist, so the details were deferred until we had a specialist.
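For anyone wondering what "macrocode after a seed value" looks like in general, here is a rough sketch of the textbook approach: Newton-Raphson refinement of a reciprocal estimate. This is not the Mill's actual emulation code. The function names are made up for illustration, and `seed_reciprocal` is a software stand-in (a standard linear approximation) for whatever a hardware seed operation would supply. It assumes normal, finite, non-zero divisors and makes no attempt at correct IEEE rounding.

```python
import math


def seed_reciprocal(m: float) -> float:
    """Hypothetical stand-in for a hardware seed op.

    For m in [0.5, 1) this linear estimate of 1/m has a relative
    error of at most 1/17.
    """
    return 48.0 / 17.0 - (32.0 / 17.0) * m


def reciprocal(d: float, iterations: int = 4) -> float:
    """Approximate 1/d by refining a seed with Newton-Raphson steps."""
    if d == 0.0:
        raise ZeroDivisionError("reciprocal of zero")
    sign = -1.0 if d < 0.0 else 1.0
    # Split |d| into mantissa and exponent: |d| = m * 2**e, 0.5 <= m < 1.
    m, e = math.frexp(abs(d))
    x = seed_reciprocal(m)
    for _ in range(iterations):
        # Each step roughly squares the relative error: x <- x * (2 - m*x).
        x = x * (2.0 - m * x)
    # Undo the scaling: 1/|d| = (1/m) * 2**-e.
    return sign * math.ldexp(x, -e)


def divide(n: float, d: float) -> float:
    """Approximate n/d as n * (1/d); the last bit may differ from true division."""
    return n * reciprocal(d)
```

The loop uses only multiplies and subtracts, which pipeline well, and that is the whole appeal of the approach. The hard part a specialist earns their keep on is everything the sketch skips: final rounding, denormals, infinities, NaNs and overflow.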
We now have those specialists (Terje and Allen, and now Larry), so the arm-waving is being turned into code. Right now the bottleneck is genAsm. Speaking of which, I'd better get back to work 🙂