Forum Topic: Pipelining
Talk by Ivan Godard – 2014-07-14 at Facebook
NOTE: the slides require genuine Microsoft PowerPoint to view; open source PowerPoint clones are unable to show the animations, which are essential to the slide content. If you do not have access to PowerPoint then watch the video, which shows the slides as intended.
Slides: pipelining.06 (.pptx)
Software pipelining on the Mill CPU:
Instant pipeline: add loop, no stirring needed
The Mill CPU architecture is very wide, able to issue and execute 30+ independent MIMD operations per cycle. Non-looping open code often cannot use this raw compute capacity, but fortunately more than 80% of cycles are spent in loops. Loops potentially have unbounded instruction-level parallelism and can absorb all the capacity available – if the loop can be pipelined.
This talk addresses how loops are pipelined on the Mill architecture. On a conventional machine, pipelining requires lengthy prelude and postlude instruction sequences to get the pipeline started and wound down, frequently destroying the benefit of pipelining the main body. Conventional pipelining can even be a net loss on short loops, especially “while” type loops whose length is unknown and data dependent. Not so on a Mill: Mill pipelines have neither prelude nor postlude, and early conditional exit has no added cost.
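To make the prelude and postlude cost concrete, here is a minimal sketch in C of a loop pipelined by hand, the way a compiler for a conventional machine might schedule it. The function names (scale_plain, scale_pipelined) and the three-stage load/multiply/store split are assumptions chosen for illustration; they are not taken from the talk, and the sketch says nothing about how the Mill itself schedules the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Plain loop: each iteration loads, multiplies, then stores. */
void scale_plain(const int *src, int *dst, size_t n, int k) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] * k;
    }
}

/* Hand-pipelined version with three stages: load, multiply, store.
 * In the steady state one pass of the loop body stores element i,
 * multiplies element i+1 and loads element i+2. */
void scale_pipelined(const int *src, int *dst, size_t n, int k) {
    assert(n == 0 || (src != NULL && dst != NULL));

    if (n < 2) {
        /* Too short to fill the pipeline: fall back to the plain loop. */
        scale_plain(src, dst, n, k);
        return;
    }

    /* Prelude: fill the pipeline before the steady state can start. */
    int loaded = src[0];
    int product = loaded * k;   /* multiply stage for element 0 */
    loaded = src[1];            /* load stage for element 1 */

    /* Steady state: all three stages are busy on every pass. */
    size_t i;
    for (i = 0; i + 2 < n; i++) {
        dst[i] = product;       /* store element i */
        product = loaded * k;   /* multiply element i+1 */
        loaded = src[i + 2];    /* load element i+2 */
    }

    /* Postlude: drain the two elements still in flight. */
    dst[i] = product;           /* element n-2 */
    dst[i + 1] = loaded * k;    /* element n-1 */
}
```

Note how much of scale_pipelined is fill, drain and the short-loop fallback rather than steady-state body. For a loop that runs only a handful of iterations, that extra code is most of the work, which is the overhead the abstract describes.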
Pipelines on conventional machines also have problems with loop-carried data, values produced by one iteration but consumed by another. Conventional code must resort to bucket-brigade register copies, or fail to pipeline altogether. Even architectures like the Itanium, which have special hardware for support, provide it only for the innermost loop. In contrast, the Mill needs no copies and can pipeline outer as well as inner loops.
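As a rough illustration of loop-carried data and the bucket-brigade copies mentioned above, consider a three-element running sum written in C. The function name sum3 and the variables prev1 and prev2 are hypothetical, chosen only for this sketch; the local variables stand in for registers on a conventional machine.

```c
#include <assert.h>
#include <stddef.h>

/* out[i] = in[i] + in[i-1] + in[i-2], treating missing elements as 0.
 * The values in[i-1] and in[i-2] are loop-carried: they were produced
 * by earlier iterations and are consumed by this one. */
void sum3(const int *in, int *out, size_t n) {
    assert(n == 0 || (in != NULL && out != NULL));

    int prev1 = 0;  /* value loaded one iteration ago */
    int prev2 = 0;  /* value loaded two iterations ago */

    for (size_t i = 0; i < n; i++) {
        int cur = in[i];
        out[i] = cur + prev1 + prev2;

        /* Bucket brigade: pass each carried value one slot down the
         * line so the next iteration finds it in the expected place.
         * These copies do no useful arithmetic. */
        prev2 = prev1;
        prev1 = cur;
    }
}
```

For example, the input {1, 2, 3, 4} produces {1, 3, 6, 9}. The two copies at the end of the body are pure bookkeeping, and their number grows with the distance over which a value is carried.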
Familiarity with prior talks in this series, especially the Belt and Metadata talks, will be helpful but not essential.
Speaker bio
Ivan Godard has designed, implemented or led the teams for 11 compilers for a variety of languages and targets, an operating system, an object-oriented database, and four instruction set architectures. He participated in the revision of Algol68 and is mentioned in its Report, was on the Green team that won the Ada language competition, designed the Mary family of system implementation languages, and was founding editor of the Machine Oriented Languages Bulletin. He is a Member Emeritus of IFIP Working Group 2.4 (Implementation languages) and was a member of the committee that produced the IEEE and ISO floating-point standard 754-2011.
Ivan is currently CTO at Mill Computing, a startup now emerging from stealth mode. Mill Computing has developed the Mill, a clean-sheet rethink of general-purpose CPU architectures. The Mill is the subject of this talk.