As a programmer of many decades, the parts I like most about what I’ve learned about the Mill are the innovations around memory. The implicit zero for stack frames is a thing of beauty. You get a strong guarantee of initialization that’s actually faster for the hardware.
Pushing the TLB to the periphery is also genius. A 64-bit address space really is a tremendous amount of room: 2^64 bytes, or 16 EiB. We all “know” that statements like “640k is enough for anyone” are laughably short-lived, but that’s only a joke until it isn’t. If “enough is as good as a feast”, then an exponential number of feasts must truly be enough for anyone. That one restriction of living in a single 64-bit address space yields so many benefits if you take advantage of it in the right way. You just have to have a wider perspective and a willingness to rethink past decisions, such as separate address spaces for each process.
Those are just a few of my favorites (NaR-bit? Implicitly loading zero? Static scheduling with the performance of OoO?). There have been so many times while learning about the Mill architecture that I’ve had that aha moment. Once a problem is framed correctly, the solutions seem obvious. It reminds me of the joke about the two mathematicians who meet at a conference and discover they are both working on the same problem. They get to talking in the bar at night and start working through it together. They work until the bar closes and then continue in the lobby until well into the next day. Finally, they make a breakthrough, and one says to the other, “Oh! I see! It’s obvious!”