Software Pipelining (Talk: July 14 2014)

Author
Posts
Will_Edwards
Moderator
July 8, 2014 at 2:00 pm
Post count: 98
#1168
Ivan Godard, CTO of Mill Computing, Inc., will be giving a talk at Facebook.
The particulars:
Monday, July 14, 2014
Doors open at 10:30 AM, Talk is from 11 AM to 12:30 PM
1 Hacker Way, Bldg 10
Menlo Park, CA 94025
Enter via the left lane of the Willow Road entrance. There is visitor parking along the front of Building 10; if all of those spaces are taken, there is overflow parking across the street. Guests should come to Building 10 and sign in. They will then be escorted to Room 11.2. Our hosts at Facebook are Edwin Smith and Jason Evans.
This will be the ninth topic publicly presented on the Mill general-purpose CPU architecture. It will cover the methods used to perform software pipelining on the Mill Architecture. The talk will assume some general familiarity with software pipelining.
Software pipelining on the Mill CPU:
Instant pipeline: add loop, no stirring needed
The Mill CPU architecture is very wide, able to issue and execute 30+ independent MIMD operations per cycle. Non-looping open code often cannot use this raw compute capacity, but fortunately >80% of cycles are in loops. Loops potentially have unbounded instruction-level parallelism and can absorb all the capacity available – if the loop can be pipelined.
This talk addresses how loops are pipelined on the Mill architecture. On a conventional machine, pipelining requires lengthy prelude and postlude instruction sequences to get the pipeline started and wound down, frequently destroying the benefit of pipelining the main body. Conventional pipelining can be of negative benefit on short loops, especially “while” type loops whose length is unknown and data dependent. Not so on a Mill: Mill pipelines have neither prelude nor postlude, and early conditional exit has no added cost.
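For readers less familiar with the prelude/postlude problem on conventional machines, here is a rough sketch of a hand-pipelined loop. It is not Mill code and not taken from the talk; Python is used only as executable pseudocode, and the function name, the three-stage split (load, multiply, add) and the short-loop fallback are all assumptions made for illustration.

```python
def pipelined_scale_add(a):
    """Compute [x * 2 + 1 for x in a] with a hand-scheduled 3-stage pipeline.

    Hypothetical stages per iteration: load -> multiply -> add/store.
    """
    n = len(a)
    out = [0] * n

    # Loops shorter than the pipeline depth cannot use the pipelined
    # schedule at all, so a conventional compiler needs a fallback path.
    if n < 2:
        for i in range(n):
            out[i] = a[i] * 2 + 1
        return out

    # Prelude: fill the pipeline before the steady state can start.
    loaded = a[0]        # stage 1 of iteration 0
    scaled = loaded * 2  # stage 2 of iteration 0
    loaded = a[1]        # stage 1 of iteration 1

    # Steady-state body: each pass overlaps three different iterations.
    for i in range(n - 2):
        out[i] = scaled + 1   # stage 3 of iteration i
        scaled = loaded * 2   # stage 2 of iteration i + 1
        loaded = a[i + 2]     # stage 1 of iteration i + 2

    # Postlude: drain the iterations still in flight.
    out[n - 2] = scaled + 1
    scaled = loaded * 2
    out[n - 1] = scaled + 1
    return out
```

Only the middle loop is the useful overlapped work; the prelude, postlude and fallback are extra code that a short loop pays for without ever reaching steady state, which is the cost the abstract says the Mill avoids.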
Pipelines on conventional machines also have problems with loop-carried data, values produced by one iteration but consumed by another. Conventional code must resort to bucket-brigade register copies, or fail to pipeline altogether. Even architectures like the Itanium, which have special hardware to support pipelining, provide it only for the innermost loop. In contrast, the Mill needs no copies and can pipeline outer as well as inner loops.
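To make "bucket-brigade register copies" concrete, here is a minimal sketch of a loop with loop-carried data as it might be written for a conventional register machine. Again this is not Mill code; the function name and the choice of a three-element sliding sum are assumptions for illustration, with Python standing in for pseudocode.

```python
def sliding_sum_of_three(xs):
    """Return xs[i] + xs[i-1] + xs[i-2] for each i, treating missing values as 0."""
    prev1 = 0  # value produced one iteration ago
    prev2 = 0  # value produced two iterations ago
    out = []
    for x in xs:
        out.append(x + prev1 + prev2)
        # Bucket brigade: every iteration shuffles each carried value
        # one slot along so that later iterations can still find it.
        prev2 = prev1
        prev1 = x
    return out
```

The two assignments at the end of the body do no arithmetic; they exist only to pass values from one iteration to the next, and their number grows with the distance between producer and consumer.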
Familiarity with prior talks in this series, especially the Belt and Metadata talks, will be helpful but not essential.
- This topic was modified 11 years, 7 months ago by staff.
Findecanor
Participant
July 19, 2014 at 12:10 pm
Post count: 37
#1192
I hope that the talk was filmed. When will the talk be up on YouTube?
Will_Edwards
Moderator
July 19, 2014 at 1:07 pm
Post count: 98
#1194
It was filmed.
Post production is quite an involved process, but is nearly finished.
We’ll post the video soon on the Architecture forum with the other talks, and announce it on the mailing list.
Subscribe to the mailing list or keep checking back manually.
David McCammond-Watts
Participant
July 29, 2014 at 3:38 pm
Post count: 13
#1209
Any ETA on the video?