Thanks for the reply.
Static scheduling will only get you so far. If you have one load per loop iteration with a medium-sized loop body, you can't really schedule 10 loads up front without unrolling a lot, because each load must somehow be associated with the instructions that process it. There are also more complex scenarios where discoverability would be useful, e.g. loads-on-loads, where the address of one load depends on the result of a previous one.
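To make the two cases concrete, here is a rough C sketch of what I have in mind. The names (`sum_array`, `sum_list`, `struct node`) are made up for illustration and say nothing about how any particular compiler or the Mill would actually handle them.

```c
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
struct node {
    long value;
    struct node *next;
};

/* One load per iteration. The address of a[i + 1] does not depend on
   the value loaded from a[i], so in principle several loads could be
   issued early, but a static schedule only gets there by unrolling. */
long sum_array(const long *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Loads-on-loads. The address of the next node is itself the result
   of the previous load (p->next), so the loads form a serial chain
   that cannot be hoisted ahead of time. */
long sum_list(const struct node *p)
{
    long sum = 0;
    while (p != NULL) {
        sum += p->value;
        p = p->next;
    }
    return sum;
}
```

In the first loop, unrolling by 10 would expose 10 independent loads per unrolled body. In the second, no amount of unrolling helps, since each address is only known once the previous load returns.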
For the load/store aliases, things are indeed interesting. I especially like the mechanism that Mill uses to prevent false sharing within a cache line.
As requested, I won’t discuss further what Mill could do. Note though that enthusiasts typically like to discuss new architectures openly, in this forum or elsewhere. The idea proposed by png (IP on the forums) to include a standard blurb in a post to prevent patent issues might be worth investigating. To me, that seems preferable to asking people to voluntarily censor themselves.