Well, you have floating-point condition codes; is there a reason those bits can’t be output by the integer ALU as well? At least for “large-enough” word sizes, so you don’t need a carry bit per byte. Doing it using 2-input adds is a bit awkward (think about -2+1+carry and -1+1+carry), but an “add operand 2’s carries to operand 1’s data, and OR (or add) the resultant carries with operand 1’s carries” instruction would do the trick, I think.
I’d output carry and signed overflow bits, and handle overflow saturation in a separate instruction. You could also do a separate “divide by 2” instruction to compute averages without overflow if that was easier than another add opcode.