Something interesting that I don’t think you’ve addressed yet is how to deal with variable-latency instructions (aside from loads). For example, floating-point ops can be orders of magnitude slower than usual when the operands are subnormals. How does the Mill deal with these sorts of corner cases, besides stalling? Does it use the same method as loads, or something else?
Reply To: Metadata thebellster