“It’s also used for bit-packing with shifts, but of course is limited to 1-bit sizes when relying on Carry.”
There are bit test/set/clear/flip ops for both dynamic and immediate bit numbers. You no longer need to use shifts.
“Marshaling and [de]serialization is one aspect that’s always a thorn in the side of execution speed, so being able to directly shift various N-bit-wide values in & out of wider numbers with under/overflow detection would be nice. Of course, if the individual steps in such an operation can be automatically executed in parallel, then the need for more powerful instructions goes away.”
A full funnel shift is complicated on any machine; we have worked on it, but are not yet satisfied with the general case of parsing a stream of varying-length items. For static fields (such as bit-fields in C) there is a merge operation that is a bitwise “pick” (take a bit from src0 or src1 based on the corresponding bit in a mask src2), which replaces (a&m)|(b& ~m). However, it’s not clear whether a compiler will recognize that idiom. We have gone back and forth on whether there should be ops for field extract (otherwise it’s two shifts) and field insert (otherwise rotate/mask/or/rotate). The problem is that there can be at most two belt inputs to an operation per slot.
“There are quite a few ABIs that use Carry as an in/out boolean flag, or side-band error return indicator. This is, of course, a result of registers being “expensive”, due to fixed register files. If belt slots are “cheaper”, then those can become normal parameters. However, there still is pressure on spilling that comes with this.”
Those ABIs assume one operation at a time execution. When there could be half-a-dozen ops trying to use that same carry flag… 🙂
“One interesting scenario that comes to mind is the SBCL compiler for Common Lisp. When calling a function in Lisp, it is unknown how many values it will return. In SBCL x86-64, a single valued return returns with carry clear. For multi-valued return, carry is set and the count of returned values is held in RCX (which is a scratch register otherwise).”
For the Mill I’d guess you would simply return the count as a second result in all cases. Is the value returned (when more than one) a list, so the flag is really saying whether thre resulting list is a primitive of a collection?
“While it’s probably a more general question than metadata, how would you address older values on the belt after a function call which returns a variable/unknown number of return values?”
In general you cannot do so; you would have to run everything you wanted to save off to scratchpad, and even that wouldn’t work because you would have no way to know how many results you got if you wanted to save them.
I confess my ignorance of Lisp implementations; how does the received of multiple results discover that happened, and what would it want to do with them when it does? I suspect that the Mill would handle this at a higher level, but I need to understand the use-case rather than the conventional implementation for that case.