Something that was not addressed in this talk is virtual and indirect calls/jumps. Predicting these has a significant effect, not just in dynamic languages but even in languages as low-level as C++.
But the type often varies at a given call site, and unlike conditional branch prediction, the choice is not limited to a binary decision between two potential addresses. One opportunity to prefetch the next EBB comes from hoisting the load of the dispatch address, which can be moved quite far up in functions that do some processing of their own before the dispatch call. However, the software would need a way to hand this address to the exit chaining mechanism for the prefetch to pay off.
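To make the hoisting idea concrete, here is a minimal C++ sketch. The `Shape` struct with an explicit function-pointer slot is a hypothetical stand-in for a vtable entry, and `scaled_area` is an invented example function. It only shows the data dependency: the dispatch address is loaded before the local processing rather than right at the call. A compiler is free to reorder this, and nothing in standard C++ passes the address to a predictor or exit chaining mechanism; that hand-off is exactly the missing piece.

```cpp
#include <numeric>
#include <vector>

// Hypothetical object layout: an explicit dispatch slot standing in
// for a vtable entry.
struct Shape {
    double (*area)(const Shape&);  // dispatch address
    double w;
    double h;
};

double rect_area(const Shape& s) { return s.w * s.h; }
double tri_area(const Shape& s) { return 0.5 * s.w * s.h; }

double scaled_area(const Shape& s, const std::vector<double>& weights) {
    // Hoisted: the dispatch address is known from here on, long before
    // the call itself is reached.
    double (*target)(const Shape&) = s.area;

    // Local processing that does not depend on the dispatch target.
    double scale = std::accumulate(weights.begin(), weights.end(), 0.0);

    // Indirect call through the address loaded at the top.
    return scale * target(s);
}
```

The longer the local processing between the load and the call, the more lead time a prefetch of the target would have, assuming the hardware could be told about it.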
There are also designs that key history tables on object identity instead of call site address. With these, multiple call sites can benefit from a single knowledge update, but I don’t think they’re as appropriate for general-purpose CPUs, as they’re usually specific to object system ABIs.
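The difference between the two keying schemes can be shown with a toy software model. This is only a sketch under simplifying assumptions, not a description of any real predictor: `TargetTable`, `Call`, `KeyBy`, `count_misses` and `sample_trace` are all invented names, the table is unbounded, and every call site is assumed to invoke the same method slot, so one object always maps to one target.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// One dynamic indirect call: where it came from, which object it was
// made on, and where it actually went.
struct Call {
    std::uint64_t site;
    std::uint64_t object;
    std::uint64_t target;
};

// Toy last-target history table.
class TargetTable {
public:
    // Returns true if the stored prediction for `key` matched `actual`,
    // then records `actual` as the new prediction.
    bool predict_and_update(std::uint64_t key, std::uint64_t actual) {
        auto it = table_.find(key);
        bool hit = (it != table_.end() && it->second == actual);
        table_[key] = actual;
        return hit;
    }

private:
    std::unordered_map<std::uint64_t, std::uint64_t> table_;
};

enum class KeyBy { CallSite, Object };

// Counts mispredictions over a trace for the chosen keying scheme.
int count_misses(const std::vector<Call>& trace, KeyBy mode) {
    TargetTable table;
    int misses = 0;
    for (const Call& c : trace) {
        std::uint64_t key = (mode == KeyBy::CallSite) ? c.site : c.object;
        if (!table.predict_and_update(key, c.target)) {
            ++misses;
        }
    }
    return misses;
}

// The same object dispatched from three different call sites.
std::vector<Call> sample_trace() {
    return {{100, 1, 0xA}, {200, 1, 0xA}, {300, 1, 0xA}};
}
```

On this trace the call-site-keyed table starts cold at each of the three sites, while the object-keyed table learns the target once and reuses it at the other two. The model also hints at the ABI problem: it needs a notion of "object" and of "same method slot" that hardware would have to get from the object system.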