Being hardware, the Mill is not language-specific, but must support all common languages, common assembler idioms, and our best guess as to where those will evolve.
The best that we, or any design team, can do is to try to isolate primitive notions from the mass of semantic bundles and provide clean support for those primitives, letting each language compose its own semantic atoms from the primitives we supply.
The “volatile” notion from C is used for two purposes, with contradictory hardware requirements for any machine other than the native PDP-11 that is C. One purpose is to access active memory such as MMIO registers, where the usual idempotency of memory access does not hold: the cache must not be used, and accesses must not be elided on the grounds of value reuse.
The other common purpose is to ensure that the actions of an asynchronous mutator (such as another core) are visible to the program. Here too it is important that accesses are not elided, but there is no need for an access to go to physical memory; it need only go to the level at which potentially concurrent accesses are visible. Where that level is located depends on whether there are any asynchronous mutators (there may be only one core, so checking for changes would be pointless) and on the cache coherency structure (if any) among the mutators.
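As a rough illustration of the two uses in ordinary C (nothing here is Mill-specific, and every name is hypothetical): the "device register" is just a static variable standing in for a real MMIO address so that the sketch can run anywhere, and the stop flag is set by a plain function call standing in for another core or an interrupt handler.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: stand-in for a device status register. On real hardware this
   would be a fixed MMIO address; a static keeps the sketch runnable. */
static uint32_t fake_status_reg = 3;
static volatile uint32_t *const STATUS = &fake_status_reg;

/* Use 1: active memory. Both reads must be performed, in order; the
   compiler may not reuse the first value for the second. */
uint32_t read_status_twice(void) {
    uint32_t first = *STATUS;
    uint32_t second = *STATUS;
    return first + second;
}

/* Use 2: a flag written by an asynchronous mutator. volatile forces the
   flag to be re-read on every iteration. (Portable C11 code would normally
   use an _Atomic flag for cross-thread signalling.) */
static volatile int stop_requested = 0;

/* Hypothetical mutator; in practice another core or a handler. */
void request_stop(void) { stop_requested = 1; }

/* Polls the flag up to max_polls times; returns how many polls saw it clear. */
int poll_until_stop(int max_polls) {
    assert(max_polls >= 0);
    int polls = 0;
    while (!stop_requested && polls < max_polls) {
        polls++;
    }
    return polls;
}
```

In both functions the C source looks the same, which is the problem: the keyword alone does not tell the hardware whether the access has to reach a device or only the level where other mutators become visible.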
Currently the Mill distinguishes these two cases using different behavior flags in the load and store operations. One of those is the volatileData flag mentioned in the previous post. This flag ensures that the access is to the top coherent level, and bypasses any caching above that level. On a monocore all caches are coherent by definition, so there this flag is a no-op. It is also safe for the compiler to optimize out redundant accesses, because the data cannot change underneath the single core.
On a coherent multicore the flag is also a no-op to the hardware but is not a no-op to the compiler: the compiler must still encode all access operations, and must not elide any under the assumption that the value has not changed, because it might have.
On an incoherent multicore (common in embedded work), a volatileData access must proceed (without caching) to the highest shared level, which may be DRAM or may be a cache that is shared by all cores. Again, the compiler must not elide accesses.
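The three volatileData cases can be written down as a small lookup in C. This is only a sketch of the rules as described here; the enum, struct, and function names are invented for illustration and are not Mill toolchain identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: hypothetical names for the three configurations discussed. */
typedef enum {
    MONOCORE,
    COHERENT_MULTICORE,
    INCOHERENT_MULTICORE
} Topology;

typedef struct {
    bool hw_bypasses_caches;   /* must the access skip non-shared caches? */
    bool compiler_may_elide;   /* may redundant accesses be optimized out? */
} VolatileDataPolicy;

/* Returns what a volatileData access requires on the given topology. */
VolatileDataPolicy volatile_data_policy(Topology t) {
    VolatileDataPolicy p = { false, false };
    switch (t) {
    case MONOCORE:
        /* No other mutator: no-op for hardware, elision is safe. */
        p.hw_bypasses_caches = false;
        p.compiler_may_elide = true;
        break;
    case COHERENT_MULTICORE:
        /* Hardware keeps caches coherent, but values may still change. */
        p.hw_bypasses_caches = false;
        p.compiler_may_elide = false;
        break;
    case INCOHERENT_MULTICORE:
        /* Must go uncached to the highest shared level. */
        p.hw_bypasses_caches = true;
        p.compiler_may_elide = false;
        break;
    }
    return p;
}

/* Small self-check of the table above. */
void check_volatile_data_policy(void) {
    assert(volatile_data_policy(MONOCORE).compiler_may_elide);
    assert(!volatile_data_policy(COHERENT_MULTICORE).hw_bypasses_caches);
    assert(volatile_data_policy(INCOHERENT_MULTICORE).hw_bypasses_caches);
}
```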
For the other usage of the C volatile keyword, namely MMIO, the access is to an active object and must reach whatever level is the residence of such objects. On the Mill that level is the aperture level, below all caches and immediately above the actual memory devices, and the flag noCache ensures that an access will reach that level. Again, the compiler must not elide noCache accesses.
Besides the noCache flag, the Mill also maps all of the physical address space to a fixed position (address zero) within the (much larger) shared virtual space. An access within that address range is still cached (unless it is noCache) but bypasses the TLB and its virtual address translation mechanism. NoCache accesses to addresses outside the physAddr region (and volatileData accesses if they get past the caches) are translated normally.
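The routing rule for the physical region can be sketched in a few lines of C. The size of the region is not given here, so PHYS_REGION_SIZE below is a made-up value, and the type and function names are likewise hypothetical; the sketch only encodes the two decisions described above (cache or not, translate or not).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: illustrative size only; the real extent of the physical
   region mapped at address zero is not stated in this post. */
#define PHYS_REGION_SIZE (UINT64_C(1) << 40)

typedef struct {
    bool may_use_cache;      /* false when the access is noCache */
    bool needs_translation;  /* true when the address goes through the TLB */
} Route;

/* Decides how an access is routed from its address and its noCache flag. */
Route route_access(uint64_t addr, bool no_cache) {
    Route r;
    r.may_use_cache = !no_cache;
    /* Addresses inside the region starting at zero bypass the TLB. */
    r.needs_translation = (addr >= PHYS_REGION_SIZE);
    return r;
}

/* Small self-check covering one address on each side of the boundary. */
void check_route_access(void) {
    Route phys = route_access(0x1000, false);
    assert(phys.may_use_cache && !phys.needs_translation);

    Route mmio = route_access(PHYS_REGION_SIZE + 0x1000, true);
    assert(!mmio.may_use_cache && mmio.needs_translation);
}
```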
There are other access control flags, but that’s enough for now 🙂