There’s no restriction to two results, although we have none more than two except the function return op itself, which can completely fill the belt if it wants.
FUs that have ops that can return more than one result have an output result register for each result. These bump up the latency chain normally if there are the same number of registers at the next higher latency. If there are not (which is common) it’s up to the hardware guys whether to add dummy result registers at the higher latencies to bump into (simpler, but more space and more belt sources) or bump the excess direct into the spiller buffer pool (more spiller sources). The software can’t tell.