Forum Topic: The Belt
Talk by Ivan Godard – 2013-07-11 at Google
NOTE: the slides require genuine Microsoft PowerPoint to view; open-source PowerPoint clones cannot show the animations, which are essential to the slide content. If you do not have access to PowerPoint, watch the video, which shows the slides as intended.
Slides: 2013-07-11_mill_cpu_belt (.pptx)
Belt Machines
Data interchange without general registers
A large fraction of the power budget of modern superscalar CPUs is devoted to renaming registers: the CPU must track the dataflow of the executing program, assign physical registers and map them to the logical registers of the program, schedule operations when their arguments are available, and restore visible state in the event of an exception, all while avoiding register update hazards.
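To make the renaming step concrete, here is a minimal sketch in Python of mapping logical registers onto fresh physical registers. It is an illustration only, not a description of any particular CPU: the function name `rename`, the three-operand instruction format, and the unlimited supply of physical registers are all assumptions made for the example. Real hardware recycles physical registers and renames several instructions per cycle, which is where the cost comes from.

```python
def rename(program, num_logical=4):
    """Rename logical registers to physical ones (toy model).

    program: list of (dest, src1, src2) logical register names.
    Returns a list of (pdest, psrc1, psrc2) physical register numbers.
    Assumption: physical registers are never freed, so numbers only grow.
    """
    # Initially, logical register r<i> lives in physical register i.
    mapping = {f"r{i}": i for i in range(num_logical)}
    next_free = num_logical
    renamed = []
    for dest, src1, src2 in program:
        # Sources read the current mapping, so true dependencies are kept.
        p1, p2 = mapping[src1], mapping[src2]
        # Every write gets a fresh physical register, so a later write can
        # never clobber a value that an earlier instruction still needs.
        mapping[dest] = next_free
        renamed.append((next_free, p1, p2))
        next_free += 1
    return renamed


# Hypothetical three-instruction program with reuse of r1 and r2.
PROGRAM = [
    ("r1", "r2", "r3"),  # r1 = r2 op r3
    ("r2", "r1", "r1"),  # r2 = r1 op r1 (overwrites r2, read above)
    ("r1", "r0", "r0"),  # r1 = r0 op r0 (overwrites r1, read above)
]

if __name__ == "__main__":
    print(rename(PROGRAM))  # [(4, 2, 3), (5, 4, 4), (6, 0, 0)]
```

After renaming, no two instructions write the same physical register, so the second and third instructions can no longer interfere with values the earlier ones still read.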
Not all CPU architectures are subject to hazards that require register renaming. Unfortunately, earlier hazard-free designs either require one-at-a-time instruction execution (stack and accumulator machines) or push hazard avoidance off onto the compiler or programmer (VLIWs). The Mill is a new machine architecture that eliminates these problems by adopting a new machine model, the “belt”.
The belt machine model is inherently free of update hazards because all operation results go onto the belt by single assignment; in other words, once created, they never change their value. Belt machines have no general registers and thus no rename registers that physically embody them. Result addressing is implicit, which produces compact code and easily accommodates operations, such as integer divide, that logically produce multiple results. The machine model integrates naturally with function calls, eliminating caller/callee save conventions and complex call preamble and postamble code.
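The single-assignment and implicit-addressing ideas can be sketched with a toy model in Python. This is only an illustration under stated assumptions, not the Mill's actual design: the class name `Belt`, the belt length of 8, the choice that new results land at position 0 while the oldest entry falls off the far end, and the helper operations `add` and `divmod_op` are all hypothetical and are not specified in the abstract.

```python
from collections import deque


class Belt:
    """Toy belt: a fixed-length queue of results that are never modified.

    Assumptions for this sketch: 8 entries by default, new results drop at
    position 0, and the oldest entry falls off the end when the belt is full.
    """

    def __init__(self, length=8):
        self._slots = deque(maxlen=length)

    def drop(self, *values):
        """Drop results in order; the last value listed ends at position 0."""
        for value in values:
            self._slots.appendleft(value)

    def get(self, pos):
        """Read an operand by its current position on the belt."""
        return self._slots[pos]

    def snapshot(self):
        return tuple(self._slots)


def add(belt, a, b):
    # The operation names only its operands; the result has no destination.
    belt.drop(belt.get(a) + belt.get(b))


def divmod_op(belt, a, b):
    # One operation, two results: both simply drop onto the belt.
    quotient, remainder = divmod(belt.get(a), belt.get(b))
    belt.drop(quotient, remainder)


def demo():
    belt = Belt()
    belt.drop(17)          # belt: (17,)
    belt.drop(5)           # belt: (5, 17)
    divmod_op(belt, 1, 0)  # 17 divmod 5 -> belt: (2, 3, 5, 17)
    add(belt, 0, 1)        # 2 + 3       -> belt: (5, 2, 3, 5, 17)
    return belt.snapshot()


if __name__ == "__main__":
    print(demo())  # (5, 2, 3, 5, 17)
```

Note that the model has no way to overwrite an entry: values are only created and, eventually, discarded, so there is nothing for a later operation to clobber.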
A belt machine has short pipelines because it lacks the extra pipe stages associated with rename; the typical misprediction penalty is five cycles (if decode is also fast). The area and power consumption of a belt core are a third of those of an equivalent superscalar, in large part because a belt lacks the large number of physical rename registers and the interconnect needed to supply register values to the functional units.
The talk explains the belt model as seen by the programmer and the physical internals of a typical implementation.
Speaker bio
Ivan Godard has designed, implemented or led the teams for 11 compilers for a variety of languages and targets, an operating system, an object-oriented database, and four instruction set architectures. He participated in the revision of Algol68 and is mentioned in its Report, was on the Green team that won the Ada language competition, designed the Mary family of system implementation languages, and was founding editor of the Machine Oriented Languages Bulletin. He is a Member Emeritus of IFIP Working Group 2.4 (Implementation languages) and was a member of the committee that produced the floating-point standard IEEE 754-2008, adopted by ISO as ISO/IEC/IEEE 60559:2011.
Ivan is currently CTO at Mill Computing, a startup now emerging from stealth mode. Mill Computing has developed the Mill, a clean-sheet rethink of general-purpose CPU architectures. The Mill is the subject of this talk.