Difference between revisions of "Speculation"
Revision as of 22:58, 5 August 2014
Speculation preemptively does computing work you are not yet sure you will need, so that by the time you are sure, it is already done.
None and NaR
The problem is that work done prematurely can clobber the work you actually need, and gets in the way wherever there is shared state. So the more you avoid shared state, the more you can do in parallel without getting in each other's way.
The Mill already does a lot in this regard by having SSA semantics on the Belt. This works great for proper data values. Conventional architectures, though, tend to keep error and condition codes as global shared state. Metadata to the rescue: in particular None, NaR, and the floating point status flags.
Speculable and Realizing Operations
The vast majority of operations in the Mill instruction set can be speculated. This means that if an operand to the operation is None or NaR, all the operation does is make the result None or NaR, or, in the case of floating point operations, combine the status flags in the result. This is true even for the load operation.
Only when values are put into shared system state does it become relevant whether the values are valid or not. This is when those values become realized. There are only a handful of instructions that realize values, in particular store and branches. They are all in the writer phase.
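The propagation rule can be illustrated with a small software model. This is only a sketch, not Mill code: the names `Tag`, `Operand` and `speculable_add` are invented for the example, and the choice that None takes precedence over NaR when both appear is an assumption the text above does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    """Hypothetical metadata tag carried alongside each belt value."""
    VALUE = "value"
    NONE = "None"   # there is no value here
    NAR = "NaR"     # Not a Result: an error that has not been acted on yet


@dataclass(frozen=True)
class Operand:
    tag: Tag
    payload: int = 0


def speculable_add(a: Operand, b: Operand) -> Operand:
    """Add two operands without ever raising an error.

    A None or NaR input simply becomes the result, so the operation is
    safe to execute before we know whether its result is needed.
    """
    # Assumption for this sketch: None is checked first, so it wins over NaR.
    for tag in (Tag.NONE, Tag.NAR):
        if a.tag is tag or b.tag is tag:
            return Operand(tag)
    return Operand(Tag.VALUE, a.payload + b.payload)


three = Operand(Tag.VALUE, 3)
four = Operand(Tag.VALUE, 4)
bad = Operand(Tag.NAR)

print(speculable_add(three, four))
print(speculable_add(three, bad).tag)
```

The point of the model is that no exception and no global flag is involved: the error travels with the data until something decides to look at it.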
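A realizing operation can be modelled in the same spirit. The sketch below uses invented names (`NONE`, `NAR`, `NaRFault`, `store`), and the specific outcomes, a fault when a NaR is stored and a silently skipped store for a None, are assumptions made for illustration; the text above only says that validity becomes relevant at this point.

```python
class NaRFault(Exception):
    """Hypothetical fault raised when a NaR reaches shared state."""


NONE = object()  # stand-in for the None metadata tag
NAR = object()   # stand-in for the NaR metadata tag


def store(memory: dict, address: int, value) -> None:
    """Realize a value by writing it into shared state (a dict here)."""
    if value is NAR:
        # Assumption: a deferred error surfaces once it would become visible.
        raise NaRFault(f"NaR stored to address {address}")
    if value is NONE:
        # Assumption: storing "no value" leaves memory untouched.
        return
    memory[address] = value


memory = {}
store(memory, 0, 5)      # a valid value is written
store(memory, 1, NONE)   # nothing happens
print(memory)
```

Everything upstream of `store` can run speculatively; only this last step has to care whether the value is real.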
Pick
The pick operation is a special beast. It has the semantics of the C ?: operator and it has zero latency and 3 operands. All this is only really possible, and cheaply possible, because it doesn't actually need a functional unit. It is implemented in the renaming of Belt locations at the cycle boundary. And it is speculable, in contrast to true branches. With those attributes it can replace a lot of conditional branches, and tends to be the operation that picks which of all the speculatively computed values are passed on to be realized.
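As a rough illustration of how pick replaces a branch, the sketch below if-converts a guarded division. The function names are invented for the example, and the assumption that a speculative division by zero yields NaR is not stated in the text above. Python evaluates all call arguments before the call, which conveniently mirrors both arms being computed before pick chooses one.

```python
NAR = object()  # stand-in for the NaR metadata tag


def speculable_div(a, b):
    """Integer division that yields NaR instead of trapping (assumed)."""
    if a is NAR or b is NAR or b == 0:
        return NAR
    return a // b


def pick(condition, if_true, if_false):
    """Model of pick: same meaning as C's `condition ? if_true : if_false`."""
    return if_true if condition else if_false


def safe_div(a, b):
    """Branch-free version of `b != 0 ? a / b : 0`."""
    quotient = speculable_div(a, b)   # computed unconditionally
    return pick(b != 0, quotient, 0)  # the NaR arm is simply never picked


print(safe_div(7, 2))
print(safe_div(7, 0))
```

With a conventional trapping divide, the division would have to sit behind a branch. Here the unwanted result is discarded by data flow instead.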
Rationale
Speculation is one of the few areas where the Mill favours higher energy consumption, because the performance gains are so great. Unneeded computation still costs energy, but since the Mill architecture is very wide issue and has a lot of functional units, it can exploit a lot of instruction level parallelism, which saves time. In general purpose code the problem usually is finding that much ILP, because there are so many branches. Most branches only exist to avoid unwanted side effects under certain circumstances.
The Mill has devised a few ways to avoid the unwanted side effects, which means far fewer of the branches in a program are hard barriers to ILP. Phasing is one of the ways. Software Pipelining of loops also makes extensive use of the NaR and the None Metadata tags for this purpose.
Speculation increases ILP across branch boundaries independently of loops. NaR, None and pick enable if-conversion on a massive scale on the Mill, removing branches altogether from the code by utilizing (meta)data flow instead of control flow. And even without pick, but with parallel branches or with condition codes, the ILP is greatly increased here, too.
Speculation vs. Prediction
Some might ask what the difference is between speculation and prediction. Those concepts are only superficially connected. While both try to avoid stalls due to branches in the execution pipelines, they go about it very differently.
Speculation eliminates branches by going down all paths of execution and choosing the right result after the fact. This generally only works for relatively short and small differences in the paths and somewhat similar code, but can cover many paths all at once on wide issue machines and avoid all latencies.
Prediction chooses one path, and tries to become as good as possible at choosing the correct one. There is no excess computation work done here. But a wrong guess means a penalty of idleness for several cycles, usually 5 cycles on the Mill. With a really wrong guess this can become a full memory latency penalty in rare cases. This works for long paths and very different code down the different paths too.
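The cost of prediction can be put into rough numbers. The sketch below is plain arithmetic, not a measurement: the 5 cycle penalty is the figure quoted above, while the miss rates and the function name `expected_penalty` are made up for illustration, and the rare full memory latency case is ignored.

```python
def expected_penalty(miss_rate: float, penalty_cycles: int = 5) -> float:
    """Average stall cycles per branch for a given misprediction rate."""
    return miss_rate * penalty_cycles


# Hypothetical miss rates, chosen only to show the trend.
for miss_rate in (0.01, 0.05, 0.20):
    cycles = expected_penalty(miss_rate)
    print(f"miss rate {miss_rate:.0%}: {cycles:.2f} stall cycles per branch")
```

Speculation pays no such penalty on the branches it removes, but spends energy computing the paths that are thrown away, which is why it suits short, similar paths while prediction suits long, divergent ones.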
Media
Presentation on Metadata and Speculation by Ivan Godard - Slides