Microcode gives me hives 🙂
Microcode is a bad fit with any wide-issue machine. A micro-copy routine won’t saturate the width (it’s just a loop, and internal dependencies will keep you from using all the FUs). When the same loop is macrocode, other app code can be scheduled to overlap with at least some of the ops of the copy, giving better utilization.
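To make that concrete, here is a minimal sketch in plain C, not Mill code. The names `copy_bytes`, `sum_words`, and `copy_then_sum` are hypothetical, made up for illustration. The point is only that when the copy is an ordinary loop, the compiler's scheduler sees its ops alongside the unrelated app work and is free to interleave them; whether it actually does depends on the compiler and the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the copy written as ordinary (macro) code.
   Each iteration is a load feeding a store, so on its own the loop
   cannot fill a wide issue width. */
static void copy_bytes(uint8_t *dst, const uint8_t *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Hypothetical app work that has no data dependency on the copy. */
static uint32_t sum_words(const uint32_t *v, size_t n) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += v[i];
    return acc;
}

/* Both pieces are visible to the compiler here, so after inlining it
   may schedule ops from the sum into issue slots the copy leaves idle.
   A microcoded copy would instead be an opaque sequence. */
static uint32_t copy_then_sum(uint8_t *dst, const uint8_t *src, size_t nbytes,
                              const uint32_t *v, size_t nwords) {
    copy_bytes(dst, src, nbytes);
    return sum_words(v, nwords);
}
```

Nothing in the source forces the two loops to run back to back; that freedom is exactly what a micro-copy routine takes away.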
Most people (Linus included, I expect) who prefer micro over a function call or in-line macrocode do so because they expect that a call is expensive and that in-line code will consume architectural resources such as registers, while the micro is cheap to get into and runs in the micro state, so it doesn’t collide with app state.
Guess what: Mill calls are cheap and give you a whole private state that doesn’t collide either.
We’re not opposed to putting complex behavior in the hardware, where the implementation will look a whole lot like a small programmable device: the Spiller is a glaring example, the memory controller and the exit prediction machinery are too, and there are more NYF. But these are free-standing engines that are truly asynchronous to the main core execution and do not contend with the application for resources. Classical microcode, though? No, our microcode is macrocode.