Neither decode 0 nor decode 1 “produce” anything except wholly or partly decoded operations.
Internally to the team, the phase between decode 1 and execute 0 is variously called decode 2, issue 0, or execute -1, depending on which aspect of it you are interested in. For external documentation we try to call it the issue stage, but there is still potential for confusion: phasing means there are three issue stages.
The phase sequence continues, with overlap, across all control flow boundaries, following the predicted path. It is entirely possible for a cycle’s reader, compute, and writer ops to belong to three different functions. The pain of a mispredict is getting the bogus state discarded and things restarted. Big OOO machines have a separate retire stage at the end, with a reorder buffer, so that updates to the architected state happen in the proper sequence; if they mispredict, they throw away everything that is in flight and re-execute the instructions. This is issue replay. The Mill takes a different approach, marking each in-flight with the cycle in which it issued. At a mispredict we let the in-flights drain to the spiller buffers, the same way we do for in-flights over a call or interrupt, discarding those marked as being from cycles down the wrong path.
Meanwhile we are restarting the decoder, and as the new ops start execution we replay the former in-flights just as we do after a return. This is result replay.
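As a rough illustration of the drain-and-replay idea described above, here is a toy Python model. It is only a sketch, not how the hardware is built: the names `InFlight`, `drain_on_mispredict`, and `replay` are made up for this example, and it assumes the wrong path is simply every cycle at or after a single cutoff cycle.

```python
from dataclasses import dataclass


@dataclass
class InFlight:
    op: str           # operation name (illustrative only)
    issue_cycle: int  # the cycle in which the operation issued
    result: int       # the value it delivers when it completes


def drain_on_mispredict(in_flights, first_wrong_cycle):
    """Drain in-flights to a (modelled) spiller buffer.

    Assumption: anything issued at or after first_wrong_cycle came from
    the wrong path and is discarded; everything older is kept.
    """
    return [f for f in in_flights if f.issue_cycle < first_wrong_cycle]


def replay(spilled):
    """Result replay: redeliver the saved results, without re-executing."""
    return [(f.op, f.result) for f in spilled]


in_flights = [
    InFlight("mul", issue_cycle=10, result=42),
    InFlight("load", issue_cycle=11, result=7),
    InFlight("add", issue_cycle=12, result=99),  # issued down the wrong path
]

spilled = drain_on_mispredict(in_flights, first_wrong_cycle=12)
replayed = replay(spilled)
```

The point of the toy is the contrast with issue replay: the surviving operations are never issued again, only their results are redelivered.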
In summary: phasing proceeds in the presence of control flow just as it would have had all the control flow been inlined.