Tail calls are, shall we say, interesting for us. For security reasons, our control stack (return addresses, spilled surrounding state) is not addressable by user code, and it would not follow a simple stack model even if it were. We also support calls across protection domains. Consequently an app-mode trampoline cannot fiddle with the state of the stack; it can only touch actual data.
However, the ISA does have one feature that should simplify simple tail calls: the branch instruction can take arguments that replace the pre-branch belt content. Unfortunately, call arguments may be too big or too many to all fit on the belt (and then there’s varargs). Those go in the frame, but the frame itself must be explicitly allocated with a specific instruction that sets all the internal (and unaddressable) registers; you can’t just do a push. Security, again.
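To make the belt constraint concrete, here is a toy model in Python. It is only a sketch and not Mill semantics: the belt length of 8, the one-slot-per-argument rule, and the names `BELT_LENGTH` and `branch_with_args` are all assumptions invented for the illustration.

```python
BELT_LENGTH = 8  # hypothetical belt size, chosen only for this sketch


def branch_with_args(old_belt, args):
    """Toy model of a branch whose arguments replace the pre-branch belt.

    The old belt content is simply discarded. Returns (new_belt, overflow),
    where `overflow` holds the arguments that did not fit on the belt.
    In this model nothing can be done with the overflow: it would need a
    frame, and a plain branch has no way to allocate one.
    """
    new_belt = list(args[:BELT_LENGTH])
    overflow = list(args[BELT_LENGTH:])
    return new_belt, overflow


# Two arguments fit, so the branch alone is enough.
belt, overflow = branch_with_args([10, 20, 30], [1, 2])
print(belt, overflow)

# Ten arguments do not fit; the last two would have to go in a frame.
belt, overflow = branch_with_args([10, 20, 30], list(range(10)))
print(belt, overflow)
```

The second case is the one that breaks the simple branch-as-tail-call idea: the leftover arguments need frame space that only the explicit allocation instruction can provide.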
Another issue is that a callee can assume that caller state is spilled and later restored in full: everything is in effect caller-saved. Thus it will assume that it has a full complement of scratch space and retire stations available. If it doesn’t, it might unexpectedly run out of resources when tail-called by a branch that does not do the resource management that a call does.
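The resource problem can be sketched with an equally crude model. Everything here is hypothetical: the capacity of 4 slots, the idea of counting scratch use as a single integer, and the names `enter_by_call` and `enter_by_branch` are assumptions for the sake of the example, not a description of the hardware.

```python
SCRATCH_CAPACITY = 4  # hypothetical number of scratch slots, for this sketch only


class OutOfScratch(Exception):
    """Raised when the toy callee asks for more scratch than is left."""


def run_callee(slots_in_use, slots_needed):
    # The callee allocates on top of whatever is already in use.
    if slots_in_use + slots_needed > SCRATCH_CAPACITY:
        raise OutOfScratch(
            f"need {slots_needed}, only {SCRATCH_CAPACITY - slots_in_use} free"
        )
    return slots_in_use + slots_needed


def enter_by_call(caller_slots, callee_needs):
    # A call spills the caller's state, so the callee starts with
    # the full complement available.
    return run_callee(0, callee_needs)


def enter_by_branch(caller_slots, callee_needs):
    # A plain branch does no resource management: the callee inherits
    # whatever the caller is still holding.
    return run_callee(caller_slots, callee_needs)


print(enter_by_call(3, 4))  # callee gets all 4 slots
try:
    enter_by_branch(3, 4)   # caller still holds 3 of the 4 slots
except OutOfScratch as exc:
    print("branch entry failed:", exc)
```

The same callee succeeds when entered by a call and fails when entered by a branch, which is the surprise described above.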
When we looked into this, we rather quickly concluded that the expected fix-the-stack-and-branch approach was hopeless. Instead we will need to define a “tailcall” hardware operation that does all the internals like a call but reuses (and extends) the pre-existing frame in a very Mill-specific way. Details TBD.