A “one-shot” drop is an interesting idea we have looked at. It would not actually simplify the spiller, which would still need to be able to handle cases in which all drops were multi-shot. The savings would come in power, and to a lesser extent in spiller bandwidth in its paths to memory. The cost is entropy in the encoding and complexity of the implementation.
It is clear that bit per drop encoding has the same functionality as an op with a bitmask. Roughly 65% of ops (dynamic) have drops, while multidrop ops are rare enough to ignore. Consequently bit-per-drop costs ~0.65 bit per op. If you figure IPC of 6 for a mid-range Mill, the overall entropy would be ~4 bits per instruction. In comparison, an op and a bit mask in complete generality (like rescue) would need a belt’s worth of bits (16 for midrange), plus the entropy of the op itself, say 9 to 15 bits depending on slot load. It’s clear that bit-per-drop has the better entropy, even before considerations of slot pressure.
The separate op approach is also somewhat redundant with the rescue op. An unrescued belt entry (i.e. one that occupied a position when the rescue executed but wasn’t mentioned by the rescue) doesn’t get spilled; it’s just marked as invalid and a reference to it will fault. The same applies to branches that reconfigure the belt. A oneShot op says which operands will be used once before they are used, while the rescue op says which drops were one-shots after they were used. We don’t currently use rescue just to winnow dead values from the belt, but we could if power measurements showed it would be worth while.