The Mill is a new general-purpose CPU architectural family. The talk will present machine-level details of the Mill support for bigger-than-scalar data.

Most modern architectures have SIMD operations that work on vectors of data, in addition to the scalar operations found on all CPUs. Typically there is a limited assortment of vector operations, working on a limited set of element sizes, and often a limited set of sizes for the whole vector. To make matters worse for the programmer, the operations and element sizes available frequently vary with the whole-vector size, and succeeding models of the “same” architecture rarely offer the same vector facilities. As a result, vector code must usually be written in assembly language and special-cased for each version of the target CPU.

The Mill has vector forms of all scalar operations, and all vector operations work for every element size that is supported for scalars; the ISA is completely regular. In addition, vectors may have any number of elements as seen by the program, while hardware parallelism is limited only by the number of functional units supporting the desired operation on the particular Mill family member. Unlike conventional CPUs, which hold intermediate vector results in power- and area-hungry vector registers, the Mill has no vector registers (nor any general registers at all); it uses the Belt (a single-assignment forwarding network) for vectors as well as scalars.
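The separation between the vector length the program sees and the width the hardware provides can be pictured with a small C model. This is only a conceptual sketch, not Mill code: the function name `add_in_chunks` and the `lanes` parameter are hypothetical, with `lanes` standing in for however many elements a given family member could process in one step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not Mill code: add two n-element vectors,
   handling `lanes` elements per step. `lanes` stands in for the
   parallel width of a particular machine. */
void add_in_chunks(const uint32_t *a, const uint32_t *b, uint32_t *out,
                   size_t n, size_t lanes)
{
    assert(lanes > 0);
    for (size_t base = 0; base < n; base += lanes) {
        /* The final chunk may hold fewer than `lanes` elements. */
        size_t end = (n - base < lanes) ? n : base + lanes;
        for (size_t i = base; i < end; ++i)
            out[i] = a[i] + b[i];
    }
}
```

Calling the function with the same inputs and different `lanes` values yields the same output; only the number of steps changes. That is the property the paragraph describes: the program's view of the vector does not depend on the width available.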

While other CPUs support bigger data in the form of vectors, none supports irregular data forms such as structs, records, or objects. Instead, the individual fields of such objects are treated as separate scalars. The Mill contains several facilities that treat composites as unitary objects rather than as collections of fields, significantly improving code and performance.
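The distinction between handling a composite field by field and handling it as one object has a rough source-level analogy in C. The sketch below is illustrative only: the `Record` type and both function names are hypothetical, and it says nothing about which machine operations the Mill or any other CPU would actually use for either form.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
typedef struct {
    int    id;
    double weight;
    char   tag[8];
} Record;

/* Field-by-field copy: each member is moved separately. */
void copy_by_fields(Record *dst, const Record *src)
{
    assert(dst && src);
    dst->id = src->id;
    dst->weight = src->weight;
    memcpy(dst->tag, src->tag, sizeof dst->tag);
}

/* Whole-object copy: the struct is assigned as a single unit. */
void copy_as_unit(Record *dst, const Record *src)
{
    assert(dst && src);
    *dst = *src;
}
```

Both functions leave `dst` with the same member values. The first spells out one transfer per field, while the second expresses the copy as a single operation on the composite.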