- Will_Edwards (Moderator), January 29, 2014 at 11:51 pm, Post count: 98
This will be the sixth topic publicly presented on the Mill general-purpose CPU architecture. It will cover only the instruction execution portion of the design. The talk will assume familiarity with pipelines in modern CPUs, but not with any particular instruction set architecture.
Instruction execution on the Mill CPU
Working at Mach 3
A perennial objection to wide-issue CPU architectures such as VLIWs and the Mill is that programs contain too little instruction-level parallelism (ILP) to make effective use of the available functional width. While software pipelining can reveal large quantities of ILP in loops, studies of open (non-loop) code have calculated a maximum ILP on the order of two instructions per cycle (IPC), well below the capacity of even conventional VLIWs, never mind super-wide architectures such as high-end Mills. The problem is that program instructions tend to form chains connected by data dependencies, which precludes executing them in parallel.
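To make the dependency-chain limit concrete, here is a minimal sketch, not taken from the talk. It assumes every operation takes one cycle and that an operation cannot issue in the same cycle as the operation producing its input, which is the conventional model the objection rests on. The graph `DEPS` and the function names are hypothetical, chosen only for illustration. Under those assumptions, nominal ILP is the operation count divided by the length of the longest dependency chain.

```python
from functools import lru_cache

# Hypothetical open-code fragment: each operation maps to the
# operations whose results it consumes.
DEPS = {
    "a": [],           # load
    "b": [],           # load
    "c": ["a", "b"],   # c = a + b
    "d": ["c"],        # d = c * 2
    "e": ["d"],        # e = d - 1
    "f": ["a"],        # f = a >> 1, independent of the c-d-e chain
}

def critical_path_length(deps):
    """Length, in operations, of the longest dependency chain."""
    @lru_cache(maxsize=None)
    def depth(op):
        # An op sits one level below its deepest producer.
        return 1 + max((depth(p) for p in deps[op]), default=0)
    return max(depth(op) for op in deps)

def nominal_ilp(deps):
    """Operations per cycle if each chain link costs a full cycle."""
    return len(deps) / critical_path_length(deps)

if __name__ == "__main__":
    print(critical_path_length(DEPS))  # longest chain: a -> c -> d -> e
    print(nominal_ilp(DEPS))
```

In this toy fragment, six operations sit on a chain of length four, so no amount of issue width can push the rate past 1.5 operations per cycle unless more than one link of a chain can complete per cycle.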
This talk addresses the ILP issue, describing how the Mill achieves much higher IPC even when the nominal ILP is relatively low. The Mill can execute as many as six chained dependent operations in a single cycle; IPC in open code typically exceeds nominal ILP by a factor of three. The talk will show in detail how this is achieved, and why we chose “Mach 3” as the name of the mechanism. In the course of the explanation, the talk will also introduce other operations whose semantics on the Mill differ from those of conventional CPUs.
- Will_Edwards (Moderator), February 3, 2014 at 6:29 am, Post count: 98
Pacific Standard Time, PST 🙂