There is an alloca operation in the Mill ISA, used for both alloca() and VLAs. It changes the internal specRegs to dynamically allocate space in the current stacklet, which space implicitly deallocates on function return. There are some issues that must be addressed by the implementation:
what happens when there is more than one alloca in an instruction bundle?
The order of effect is not defined, nor is the order between bundles in the same EBB. The only way the order could make a difference is if the program compared the addresses returned by the two alloca invokes, which comparison is illegal per the C and C++ standards.
What happens if the requested allocation does not fit in the current partially occupied stacklet?
This is handled the same way that stacklet overflow for ordinary locals is handled: the hardware (directly or via a trap, depending on the member implementation) dynamically allocates a new stacklet in the address space, allocates the desired range, makes it visible in the PLB to the current turf, and arranges that the stack cutback at return will reverse the allocation. There is some magic that prevent excessive cost if the allocation repeatedly crosses a stacklet with an iterative allocate/deallocate sequence; NYF.
What happens if the allocation is too large to fit in an empty standard stacklet?
There has been some sentiment in favor of simply banning (and faulting) this usage; after all, other systems do have upper bounds on the size of alloca’s. However, currently the ISA defines that this condition will be supported by the hardware, typically via traps to system software for both allocation and deallocation. The upper bound is defined by the application’s memory quantum, as with other allocations of raw address space like mmap().