The role of branch prediction is to hide two things: CPU pipeline latency and memory fetch latency. With your proposal of inlined instructions, we still need a predictor to choose between the inlined and the normal flow of instructions, so it doesn't help with CPU latency. (If you're proposing to split the CPU power between the two flows, then this is just equivalent to instruction hoisting, something the Mill is already very well set up to do statically.) It does help with the memory fetch latency, but at that point we are talking tens of instructions to match L2 load latency, and that's a lot of bandwidth to eat up. And in the case of indirect jumps, which instructions would you inline?
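
To put a rough number on the bandwidth point, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, and the 12-cycle L2 latency, 4 ops issued per cycle and 4 bytes per op are placeholder figures, not Mill specifications.

```python
def inlined_ops_to_cover(fetch_latency_cycles: int, ops_issued_per_cycle: int) -> int:
    """Ops that must sit inline after a branch to keep issue busy
    while the real target is still being fetched (hypothetical model)."""
    if fetch_latency_cycles < 0 or ops_issued_per_cycle < 0:
        raise ValueError("inputs must be non-negative")
    return fetch_latency_cycles * ops_issued_per_cycle


def extra_fetch_bytes(inlined_ops: int, bytes_per_op: int) -> int:
    """Extra code bytes fetched per branch for the inlined copy."""
    if inlined_ops < 0 or bytes_per_op < 0:
        raise ValueError("inputs must be non-negative")
    return inlined_ops * bytes_per_op


# Placeholder figures, NOT measured values for any real core.
ops = inlined_ops_to_cover(fetch_latency_cycles=12, ops_issued_per_cycle=4)
print(ops, extra_fetch_bytes(ops, bytes_per_op=4))
```

Under those assumed figures, every branch would carry a few dozen inlined ops, and that copy gets fetched whether or not the branch goes that way, which is where the bandwidth cost comes from.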
Reply To: Prediction tbickerstaff