Say frame 17 calls frame 18, which does a MULF but returns to 17 before the MULF finishes. When the result pops out of the multiplier it is labeled as belonging to 18, known to be exited, and is discarded.
If 18 does a tail-call instead of a return, the called function just replaces frame 18, it doesn’t add a new frame 19 the way a regular call would. That way it can use the return linkages of 18 when it finally returns back to 17. But now the MULF comes out looking like it belongs to the current 18 (the tail-called function), but it doesn’t. Oops.
The structures to solve this are obvious, but add cost to everything, not just tail calls.