Difference between revisions of "Instruction Set/call"
Latest revision as of 17:48, 4 February 2015
These are the general-purpose, unconditional, abstract call operations you will use when writing general assembly code. Depending on the number and order of arguments and the number of return values, the assembler picks the densest of many possible encodings. The result can be a gang or a short single operation. For gangs, which accommodate more arguments, there are encoding optimizations for the common case of zero or one return values.
The call operation is one of the most important novel features of the Mill architecture. It almost completely eliminates calling conventions, function prologues and epilogues, and memory references in calls. Almost all arguments are passed via the belt; certainly all native data types up to belt size are.
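As a rough illustration of how an assembler might map the abstract call onto one of the alternate encodings named further down (call0, call1, calln), here is a minimal Python sketch. It assumes that call0 and call1 are the forms for zero and one return values and that calln covers everything else; the function name `pick_call_mnemonic` is hypothetical, and the real assembler also weighs argument count and order, which this sketch ignores.

```python
def pick_call_mnemonic(n_results):
    """Pick a mnemonic from the declared number of return values.

    Assumption: call0/call1 are the optimized zero/one-result forms and
    calln is the general form. This is a toy model, not the real assembler.
    """
    if n_results < 0:
        raise ValueError("number of return values cannot be negative")
    if n_results == 0:
        return "call0"
    if n_results == 1:
        return "call1"
    return "calln"


for count in (0, 1, 3):
    print(count, pick_call_mnemonic(count))
```

The point of the sketch is only that the choice is made statically, from information the programmer already supplies in the abstract call.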
Things like varargs and big structures still need to be passed through the stack.
A call allocates a new belt in a new frame, initialized with the function arguments in order. Each slot of an instruction that has a control functional unit can hold a different call operation, provided the slot is not used for a gang by a call operation with more arguments. Those call operations are chained into each other, with the return values of one placed into the belt of the next call. This implements a hardware tail call mechanism without involving any memory.
The target of a function call doesn't necessarily have to be an EBB that implements the function. It can also be a Portal, i.e. it can initiate a full context switch. What the call hardware does depends on the Permissions on the address.
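The permission-driven choice between an ordinary call and a portal call can be pictured as a simple dispatch. The sketch below is purely illustrative: `TargetKind`, `dispatch_call`, the dictionary used as a permission table, and the addresses are hypothetical stand-ins and do not reflect how permissions are actually represented on the Mill.

```python
from enum import Enum, auto


class TargetKind(Enum):
    EBB = auto()     # ordinary code entry point
    PORTAL = auto()  # entry point that triggers a context switch


def dispatch_call(target_address, permission_table):
    """Decide what a call does from the permissions on the target address."""
    kind = permission_table.get(target_address)
    if kind is TargetKind.EBB:
        return "ordinary call"
    if kind is TargetKind.PORTAL:
        return "portal call with context switch"
    raise PermissionError(f"no call permission for address {target_address:#x}")


# Hypothetical addresses, for illustration only.
permissions = {0x1000: TargetKind.EBB, 0x2000: TargetKind.PORTAL}
print(dispatch_call(0x1000, permissions))
print(dispatch_call(0x2000, permissions))
```

The caller's code is the same in both cases; only the permissions attached to the target differ.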
There are also conditional calls, which are handy considering how common predicated calls are.
Call operations always have to define how many return values there will be, so the decoder can work with the correct belt indices to route values to the FUs.
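To see why the return count must be known statically, consider where a value already on the belt ends up after the call. In the toy model below, every return value pushes existing values one position further back. The function name `position_after_call` and the belt length of 8 in the usage lines are assumptions for illustration.

```python
def position_after_call(position_before, n_results, belt_length):
    """Return the belt position of an existing value after a call.

    Each return value dropped on the belt shifts older values back by one.
    Returns None if the value has fallen off the end of the belt.
    """
    new_position = position_before + n_results
    if new_position >= belt_length:
        return None
    return new_position


# Hypothetical belt length of 8.
print(position_after_call(0, 2, 8))  # value at position 0, call returns 2 values
print(position_after_call(7, 1, 8))  # value at the last position, call returns 1
```

Without the declared count, the decoder could not tell which belt index an operation after the call should use for a value produced before it.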
related operations: calltr, callfl, retn
call(lit n, p target, args args) → ops r0 ...
An indirect call to a dynamically computed address.
encoding:
call(lit n, p target, off argv, count argc)
encoding:
call(lit n, p target, off argv, count argc, lit argv)
alternate encoding: call0, call1, calln,
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 1 |
Copper | F0 F1 | 1 |
Silver | F0 F1 F2 | 1 |
Gold | F0 F1 F2 F3 | 1 |
Decimal8 | F0 F1 F2 | 1 |
Decimal16 | F0 F1 F2 | 1 |
call(lit n, lbl target, args args) → ops r0 ...
Function is known at compile time.
encoding:
call(lit n, off target, count argc)
encoding:
call(lit n, off target, count argc, lit argv)
encoding:
call(lit n, off target, count argc, lit argv, lit argv)
alternate encoding: call0, call1, calln,
Core | In Slots | Latencies |
---|---|---|
Tin | F0 | 1 |
Copper | F0 F1 | 1 |
Silver | F0 F1 F2 | 1 |
Gold | F0 F1 F2 F3 | 1 |
Decimal8 | F0 F1 F2 | 1 |
Decimal16 | F0 F1 F2 | 1 |