This patent covers the split-stream encoding method, which lets Mill CPUs decode extremely wide instructions (those with many independent operations per instruction) using a compact, flexible variable-length encoding that is also fast to decode. The program's instruction stream is split into two streams of half-instructions that are stored separately in memory but processed in lock-step by the CPU decoder. Wide fixed-length instruction encodings use an impractical amount of cache and memory, and variable-length encodings take decode time polynomial in the width, so instruction decode on legacy CPUs is limited to eight operations per cycle or fewer. Mill split-stream encoding supports instruction widths of over thirty operations decoded per cycle. In addition, split-stream permits doubling the amount of instruction cache with no clock or pipeline penalty.
Split-stream encoding is described here in a way that is more accessible than the patent text.
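As a rough illustration of the idea of two separately stored streams that are decoded in lock-step, here is a toy sketch in Python. The length-prefixed byte layout and the names encode, read_half and decode are assumptions made for this example; they are not the Mill's actual encoding or anything taken from the patent.

```python
# Toy model: each instruction is split into two half-instructions that live in
# separate streams. The length-prefix layout is an assumption for illustration,
# not the Mill's real bit format.

def encode(program):
    """program: list of (left_ops, right_ops) pairs of bytes objects (< 256 bytes each)."""
    left, right = bytearray(), bytearray()
    for left_ops, right_ops in program:
        left += bytes([len(left_ops)]) + left_ops      # variable-length half
        right += bytes([len(right_ops)]) + right_ops   # variable-length half
    return bytes(left), bytes(right)

def read_half(stream, pc):
    """Decode one half-instruction at pc; return (ops, next_pc)."""
    n = stream[pc]
    return stream[pc + 1 : pc + 1 + n], pc + 1 + n

def decode(left, right):
    """Walk both streams in lock-step, taking one half from each per step."""
    lpc = rpc = 0
    out = []
    while lpc < len(left) and rpc < len(right):
        l_ops, lpc = read_half(left, lpc)
        r_ops, rpc = read_half(right, rpc)
        out.append((l_ops, r_ops))
    return out

program = [(b"\x01\x02", b"\x10"), (b"", b"\x20\x21\x22"), (b"\x03", b"")]
left_stream, right_stream = encode(program)
decoded = decode(left_stream, right_stream)
```

Each stream keeps its own position, so the two halves of an instruction can have different lengths and still be fetched together.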
U.S. Patent 9,513,921 – Computer Processor Employing Temporal Addressing For Storage of Transient Operands
This patent is for the Belt, the Mill CPU mechanism that replaces the function of the general registers used for temporary storage in legacy CPUs. Because each Belt entry is write-once-read-many, the Mill is immune to the ordering hazards (WAW, RAW, WAR) that force use of massive numbers of rename registers in legacy CPUs. Removing the rename registers and their associated power-hungry circuitry leads to a more compact layout for better yield and lower cost, and saves the pipeline delay of the several stages devoted to rename translation.
The Belt is described here in a way that is more accessible than the patent text.
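A minimal sketch of temporal addressing, assuming the Belt behaves like a fixed-length queue in which operands are named by how recently they were produced. The class name Belt, the method names drop and read, and the belt lengths are illustrative assumptions, not details from the patent.

```python
from collections import deque

class Belt:
    """Toy belt: a fixed-length queue of results addressed by age."""

    def __init__(self, length=8):          # length 8 is an assumed value
        self._slots = deque(maxlen=length)

    def drop(self, value):
        """A new result enters at the front; the oldest falls off the end."""
        self._slots.appendleft(value)

    def read(self, position):
        """Position 0 is the newest result, 1 the one before it, and so on."""
        return self._slots[position]

belt = Belt(length=4)
belt.drop(3)                               # belt: [3]
belt.drop(4)                               # belt: [4, 3]
belt.drop(belt.read(0) + belt.read(1))     # add drops 7, belt: [7, 4, 3]
belt.drop(belt.read(0) * belt.read(2))     # mul drops 21, belt: [21, 7, 4, 3]
belt.drop(0)                               # belt is full, so 3 falls off: [0, 21, 7, 4]
```

No entry is ever overwritten in place: a value is written once when it drops and can be read any number of times until it falls off the end, which is why the sketch needs no renaming step.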
This patent is for the per-byte validation used in Mill caches. Each byte has an extra bit that indicates whether the data in that byte is valid, or must be found lower in the memory hierarchy. The valid-bits obviate the write buffers and consolidating buffers used by legacy CPUs that must update entire cache lines, a substantial saving in power, area and complexity. In addition, a Mill store operation takes effect at once and need not wait for a line to be read from external memory, so slow memory barrier operations are not needed by the Mill program.
The per-byte cache validation is described here in a way that is more accessible than the patent text.
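The following sketch shows one way to picture per-byte valid bits: a store marks only the bytes it writes as valid, and a load takes valid bytes from the line and the remaining bytes from the next level down. The 8-byte line size, the CacheLine class and the lower_level argument are assumptions for illustration only.

```python
LINE_SIZE = 8  # assumed line size in bytes, kept small for readability

class CacheLine:
    def __init__(self):
        self.data = bytearray(LINE_SIZE)
        self.valid = [False] * LINE_SIZE   # one valid bit per byte

    def store(self, offset, payload):
        """Write bytes and mark them valid; the rest of the line is not fetched."""
        for i, b in enumerate(payload):
            self.data[offset + i] = b
            self.valid[offset + i] = True

    def load(self, offset, length, lower_level):
        """Take valid bytes from this line and the rest from lower_level."""
        return bytes(
            self.data[i] if self.valid[i] else lower_level[i]
            for i in range(offset, offset + length)
        )

memory = bytes(range(100, 100 + LINE_SIZE))   # stand-in for the next level down
line = CacheLine()
line.store(2, b"\xAA\xBB")                     # takes effect at once
result = line.load(0, 4, memory)
```

Here the load of bytes 0 through 3 combines two bytes from the lower level with the two freshly stored bytes, without the store ever having waited on the lower level.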
This patent covers two different ways to use meta-information in pointer formats. In one use, each pointer carries a few “event” bits in addition to its target address. These bits are checked against several mask registers in the CPU whenever a memory operation executes (a load, a store, or specifically a store of a pointer); a match triggers a trap to application software or the runtime system. When set appropriately, the event bits speed up certain kinds of garbage collection and detect several kinds of security violations.
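The event-bit check can be sketched as follows. The 60/4 split of a 64-bit pointer, the mask values, and the rule that a "match" means a non-zero bitwise AND are all assumptions made for this example; the text above does not specify them.

```python
ADDRESS_BITS = 60                      # assumed split of a 64-bit pointer
EVENT_SHIFT = ADDRESS_BITS
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1

class EventTrap(Exception):
    """Stands in for the trap to application software or the runtime."""

# One mask register per kind of memory operation (values are made up).
mask_registers = {"load": 0b0001, "store": 0b0010, "store_pointer": 0b0100}

def make_pointer(address, event_bits):
    return (event_bits << EVENT_SHIFT) | (address & ADDRESS_MASK)

def check_access(pointer, operation):
    """Return the target address, or raise EventTrap on a mask match."""
    event_bits = pointer >> EVENT_SHIFT
    if event_bits & mask_registers[operation]:   # assumed matching rule
        raise EventTrap(operation)
    return pointer & ADDRESS_MASK

p = make_pointer(0x1000, 0b0010)       # flagged for stores only
loaded_from = check_access(p, "load")  # no match, returns the address
try:
    check_access(p, "store")
    trapped = False
except EventTrap:
    trapped = True
```

A garbage collector could, under these assumptions, set an event bit on pointers into a region it is moving and let the trap handler fix up any access that reaches them.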
In the second use, the pointer format holds granularity information, and pointer arithmetic and array access operations can be checked by hardware for bounds violations without requiring memory tag bits or increasing the size of a pointer.
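One way such a check could work is sketched below, assuming the granularity field says that the object lies within an aligned block of 2**granularity bytes and that pointer arithmetic must stay inside that block. This rule and the names pointer_add and BoundsTrap are assumptions for illustration; the actual pointer format is defined in the patent.

```python
class BoundsTrap(Exception):
    """Stands in for the hardware bounds-violation trap."""

def pointer_add(address, granularity, offset):
    """Add offset; trap if the result leaves the aligned 2**granularity block."""
    result = address + offset
    if result < 0 or (result >> granularity) != (address >> granularity):
        raise BoundsTrap(hex(result))
    return result

base = 0x1000                          # start of an assumed 64-byte block (granularity 6)
inside = pointer_add(base, 6, 63)      # last byte of the block
try:
    pointer_add(base, 6, 64)           # one byte past the block
    escaped = True
except BoundsTrap:
    escaped = False
```

Because the check compares only the address bits above the granularity, it needs no separate bounds record in memory and no extra pointer width.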