Difference between revisions of "Instruction Set/call"


Latest revision as of 17:48, 4 February 2015

realizing, flow stream, flow block, call phase, operation

native on: all

These are the general-purpose, unconditional abstract call operations you will use when writing general assembly code. Depending on the number and order of arguments and the number of return values, the assembler picks the densest of many possible encodings. The result can be a gang or a short single operation. For the gangs, which accommodate more arguments, there are encoding optimizations for the common cases of zero or one return value.
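
As a rough illustration of that selection, the Python sketch below picks a mnemonic from the number of return values alone. The mapping of call0, call1 and calln to zero, one and several results is an assumption read off the names in the encoding lists on this page. `pick_call_mnemonic` is a hypothetical helper, not part of any Mill toolchain, and the real assembler also takes the number and order of arguments into account.

```python
def pick_call_mnemonic(n_results: int) -> str:
    """Hypothetical helper: choose a call mnemonic from the result count.

    Assumes call0 means no result, call1 one result, calln anything more.
    """
    if n_results < 0:
        raise ValueError("result count cannot be negative")
    if n_results == 0:
        return "call0"
    if n_results == 1:
        return "call1"
    return "calln"
```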

The call operation is one of the most important novel features of the Mill architecture. It almost completely abolishes calling conventions, function prologues and epilogues, and memory references in calls. Almost all arguments are passed via the belt; certainly all native data types up to belt size are.
Things like varargs and big structures still need to be passed through the stack.

A call allocates a new belt in a new frame, initialized with the function arguments in order. There can be a different call operation in each slot of an instruction that has a control functional unit, provided those slots are not used for the gangs of a call operation with more arguments. These call operations are chained into each other: the return values of one are placed into the belt of the next. This implements a hardware tail call mechanism without involving any memory.
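
The belt handling of a single call can be modelled with a toy simulation. This is only a sketch under stated assumptions: the belt length of 8, the convention that the first argument and the first result sit at position 0, and the names `Belt`, `call` and `add` are all invented for illustration and say nothing about how the hardware is built. It shows a callee starting with a fresh belt that holds only its arguments, and the results being dropped onto the caller's belt on return.

```python
BELT_LENGTH = 8  # assumed for illustration; the real length depends on the core


class Belt:
    """Toy belt: position 0 is the newest value, old values fall off the end."""

    def __init__(self, initial=()):
        # Assumption: the first initial value sits at position 0.
        self._slots = list(initial)[:BELT_LENGTH]

    def drop(self, value):
        """Put a new value at the front and discard anything past the end."""
        self._slots.insert(0, value)
        del self._slots[BELT_LENGTH:]

    def __getitem__(self, position):
        return self._slots[position]

    def __len__(self):
        return len(self._slots)


def call(caller_belt, function, args, n_results):
    """Model a call: fresh belt for the callee, results dropped on the caller."""
    callee_belt = Belt(args)          # new belt holding only the arguments, in order
    results = function(callee_belt)   # the callee never sees the caller's belt
    if len(results) != n_results:
        raise ValueError("callee returned a different number of results than declared")
    for value in reversed(results):   # reversed so the first result ends at position 0
        caller_belt.drop(value)


def add(belt):
    """Example callee: adds its two arguments and returns one result."""
    return [belt[0] + belt[1]]


caller = Belt()
caller.drop(3)
caller.drop(4)
call(caller, add, [caller[1], caller[0]], n_results=1)
```

No memory is touched anywhere in this model: arguments and results only ever move between belts, which is the property the paragraph above describes.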

The target of a function call doesn't necessarily have to be an EBB that implements the function. It can also be a Portal, i.e. it can initiate a full context switch. What the call hardware does depends on the Permissions on the address.

There are also conditional calls, which are handy considering how common predicated calls are.

Call operations always have to define how many return values there will be, so the decoder can work with the correct belt indices to route values to the FUs.
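
One way to see why the count has to be known up front: if results are dropped at the front of the belt, every value already there moves back by the number of results, so any later operand reference can only be resolved once that number is fixed. The Python sketch below works through that arithmetic. The front-drop convention, the default belt length of 8 and the helper name `position_after_call` are assumptions made for illustration.

```python
from typing import Optional


def position_after_call(position: int, n_results: int,
                        belt_length: int = 8) -> Optional[int]:
    """Belt position of a value after a call drops n_results new values.

    Returns None when the value has been pushed off the end of the belt.
    """
    new_position = position + n_results
    return new_position if new_position < belt_length else None
```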

related operations: calltr, callfl, retn


call(lit n, p target, args args) → ops r0 ...

operands: like Inv :

An indirect call to a dynamically computed address.

encoding: call(lit n, p target, off argv, count argc)
encoding: call(lit n, p target, off argv, count argc, lit argv)
alternate encoding: call0, call1, calln,

Core       In Slots     Latencies
Tin        F0           1
Copper     F0 F1        1
Silver     F0 F1 F2     1
Gold       F0 F1 F2 F3  1
Decimal8   F0 F1 F2     1
Decimal16  F0 F1 F2     1

call(lit n, lbl target, args args) → ops r0 ...

operands: like Inv :

Function is known at compile time.

encoding: call(lit n, off target, count argc)
encoding: call(lit n, off target, count argc, lit argv)
encoding: call(lit n, off target, count argc, lit argv, lit argv)
alternate encoding: call0, call1, calln,

Core       In Slots     Latencies
Tin        F0           1
Copper     F0 F1        1
Silver     F0 F1 F2     1
Gold       F0 F1 F2 F3  1
Decimal8   F0 F1 F2     1
Decimal16  F0 F1 F2     1


Instruction Set, alphabetical, Instruction Set by Category, Instruction Set, sortable, filterable