Difference between revisions of "Instruction Set/load"
speculable flow stream flow block writer phase operation in the logical value domain
native on: all
Latest revision as of 14:00, 23 February 2021
Schedules a load of data from a memory address onto the belt.
The load operation is a central piece of the Mill architecture. Because the Mill is a statically scheduled instruction set, operations normally have a known, constant latency that the scheduler works with. That approach is not possible for loads: by their very nature they depend on the memory hierarchy, and they take different amounts of time depending on where in the cache hierarchy the value is found.
When a value has to be pulled all the way from memory, scheduling doesn't matter, since the machine stalls anyway. Making the cache delays predictable, however, would remove the last obstacle to making all operations statically schedulable.
This is achieved by adding a parameter to all load operations: the delay, which specifies the latency and so defines when the load retires and drops the acquired value on the belt. Load latency is therefore not constant, but it is still known and defined for each invocation.
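As a rough illustration of the delay parameter, the following Python sketch models a load that is issued in one cycle and drops its value on the belt a stated number of cycles later. The class names `ToyBelt` and `ToyCore`, the belt length of 8, and the dictionary used as memory are all assumptions made for this example; they are not part of the Mill specification.

```python
from collections import deque


class ToyBelt:
    """Toy belt: the newest value sits at position 0, old values fall off the end."""

    def __init__(self, length=8):  # belt length is an assumption for the sketch
        self.values = deque(maxlen=length)

    def drop(self, value):
        self.values.appendleft(value)


class ToyCore:
    """Hypothetical model of loads that retire after an explicit delay."""

    def __init__(self, memory):
        self.memory = memory      # dict: address -> value
        self.belt = ToyBelt()
        self.cycle = 0
        self.in_flight = []       # list of (retire_cycle, address)

    def load(self, address, delay):
        # The delay is stated by the program, so the retire cycle is known at issue.
        self.in_flight.append((self.cycle + delay, address))

    def step(self):
        # Advance one cycle and retire every load whose delay has elapsed.
        self.cycle += 1
        remaining = []
        for retire_cycle, address in self.in_flight:
            if retire_cycle == self.cycle:
                self.belt.drop(self.memory[address])
            else:
                remaining.append((retire_cycle, address))
        self.in_flight = remaining
```

In this model a load issued with `delay=3` leaves the belt untouched for two cycles and drops its value in the third, which is the property a static scheduler would rely on.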
A second option for scheduling loads is to use tags. This is useful when the same loaded value is wanted in two different branches but in a different cycle in each, or when one branch may not need it at all. Instead of the value simply dropping on the belt after the delay, it has to be explicitly retrieved from, or refused at, the Retire Station with the pickup or refuse operation.
This enables speculative loads.
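The tagged form can be pictured with a small Python sketch. The `RetireStations` class, its method names mirroring the `load`, `pickup` and `refuse` operations, the string tags, and the `use_in_branch` helper are hypothetical names chosen for illustration only; the real hardware interface is not described here.

```python
class RetireStations:
    """Hypothetical model: tagged loads wait until they are picked up or refused."""

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value
        self.pending = {}      # tag -> address of the in-flight load

    def load(self, address, tag):
        # Issue a tagged load; nothing reaches the belt yet.
        self.pending[tag] = address

    def pickup(self, tag):
        # Retire the load and hand back its value.
        address = self.pending.pop(tag)
        return self.memory[address]

    def refuse(self, tag):
        # Discard the load without using its value.
        self.pending.pop(tag)


def use_in_branch(stations, take_value):
    """Issue the load early, then decide per branch whether the value is wanted."""
    stations.load(0x40, tag="t0")
    if take_value:
        return stations.pickup("t0")
    stations.refuse("t0")
    return None
```

The point of the sketch is only that the load is issued before the branch is resolved, and each branch then either claims or abandons it.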
Loads are aliasing-safe. This means the value a load returns is the value that is at the address at the time the load retires, not at the time the load is issued. Any stores to the address of an in-flight load are tracked and reflected in the result.
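A minimal Python sketch of this retire-time semantics follows. The function `run_load` and the way stores are passed in (a dictionary keyed by cycles after issue) are assumptions for the example; stores landing in the same cycle as the retire are deliberately left out, since their ordering is not stated above.

```python
def run_load(memory, address, delay, stores):
    """Return what a load issued now would deliver when it retires.

    memory: dict address -> value
    stores: dict mapping a cycle after issue (1 .. delay-1) to (address, value)
    """
    # Apply every store that lands strictly before the load retires.
    for cycle in range(1, delay):
        if cycle in stores:
            store_address, value = stores[cycle]
            memory[store_address] = value
    # The load observes memory as it is at retire time, not at issue time.
    return memory[address]
```

With memory holding 1 at the address, a store of 99 two cycles after issue, and a delay of 4, the load in this model returns 99 rather than 1.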
There are several different addressing modes for loads. The general formula for computing addresses is base+offset+(scale*index).
Base can come from a number of special Registers or the belt. Offset is always an inline constant. Those two are always present, although a zero offset doesn't take any space at all.
Scale and index are optional and always appear together. The scale is a compile-time constant; the index always comes from the belt.
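The address formula and the rule that scale and index appear only as a pair can be written out as a short Python sketch. The function name `effective_address` and the use of `None` to mark an absent scale/index pair are conventions of this example, not of the instruction encoding.

```python
def effective_address(base, offset=0, scale=None, index=None):
    """Compute base + offset + (scale * index); scale and index come as a pair."""
    if (scale is None) != (index is None):
        raise ValueError("scale and index must be given together")
    address = base + offset          # base and offset are always present
    if scale is not None:
        address += scale * index     # optional scaled-index term
    return address
```

For example, `effective_address(0x1000, 8, scale=4, index=3)` gives 0x1014, that is 0x1000 + 8 + 12.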
Another compile-time parameter is the width and scalarity of the loaded value. The minimum is one byte, but if the Core allows it, a vector of eight 32-bit elements can be loaded directly.
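To make width and scalarity concrete, the Python sketch below reads one or more fixed-width elements from a byte buffer. The helper `load_elements`, the flat `bytes` object standing in for memory, and the little-endian byte order are assumptions of the example; nothing above states the Mill's byte order.

```python
def load_elements(memory, address, width_bytes, count=1):
    """Read `count` elements of `width_bytes` bytes each, starting at `address`."""
    elements = []
    for lane in range(count):
        start = address + lane * width_bytes
        chunk = memory[start:start + width_bytes]
        if len(chunk) != width_bytes:
            raise IndexError("load runs past the end of memory")
        # Little-endian is an assumption made for this sketch.
        elements.append(int.from_bytes(chunk, "little"))
    return elements
```

A scalar byte load is `load_elements(memory, address, 1)`, while the vector case mentioned above is `load_elements(memory, address, 4, 8)`, which touches 32 consecutive bytes.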
related operations: store, pickup, refuse, loadf, loadd
load(base base0, off off0, width width0, )
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 3 |
Copper | F0 | 3 |
Silver | F0 F1 F2 | 3 |
Gold | F0 | 3 |
load(base b, off o, width w, tag tag)
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 3 |
Copper | F0 | 3 |
Silver | F0 F1 F2 | 3 |
Gold | F0 | 3 |
load(op op0, off off0, width width0, )
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 3 |
Copper | F0 | 3 |
Silver | F0 F1 F2 | 3 |
Gold | F0 | 3 |
load(p b, off o, width w, tag tag)
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 3 |
Copper | F0 | 3 |
Silver | F0 F1 F2 | 3 |
Gold | F0 | 3 |
load(op op0, width width0, memAttr memAttr0, )
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 3 |
Copper | F0 | 3 |
Silver | F0 F1 F2 | 3 |
Gold | F0 | 3 |
load(p b, width w, memAttr m, tag tag)
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 3 |
Copper | F0 | 3 |
Silver | F0 F1 F2 | 3 |
Gold | F0 | 3 |