I’m not sure I’d consider an RNG a ‘very-long-latency’ operation. Suitably pipelined, it becomes a generating function that produces a value on every tick, at the expense of area. Conditioning the output of an entropy source is an algorithm though, so I see where you are coming from. If you have the budget, I’d urge at least a PRNG; I think it would be of general value.
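To make the "value on each tick" idea concrete, here is a minimal software sketch of the kind of PRNG step I have in mind. It uses Marsaglia's xorshift32 with the 13/17/5 shift triple, which is my choice for illustration and not anything proposed for the hardware. The function name `xorshift32_step` is made up. The step is only three shift-and-XOR stages with no memory access, so it is the sort of thing that could plausibly be pipelined.

```c
#include <stdint.h>

/* One step of Marsaglia's xorshift32 (shift triple 13, 17, 5).
 * Illustrative only: the generator choice and the function name are
 * assumptions, not part of any proposed instruction set.
 * The state must be nonzero; zero maps to zero forever. */
uint32_t xorshift32_step(uint32_t x)
{
    x ^= x << 13;   /* stage 1: left shift and XOR  */
    x ^= x >> 17;   /* stage 2: right shift and XOR */
    x ^= x << 5;    /* stage 3: left shift and XOR  */
    return x;       /* new state, also used as the output value */
}
```

Each call takes the previous state and returns the next one, so the caller carries the state along as an ordinary value. This is not cryptographically strong; it only stands in for "cheap local randomness".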
In the case of Monte Carlo, I see a potential improvement from the temporal addressing provided by the belt. For maximum impact, the trial state space must fit on the belt, and there would need to be a local source of randomness. If the inner loop reverts to memory access, it’s no better than a streamed vector approach on a GPU or DSP.
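As a rough illustration of an inner loop whose whole trial state is a handful of scalars, here is a sketch of a Monte Carlo estimate of pi. Everything in it is an assumption for the sake of example: the function name `estimate_pi`, the inline xorshift32 generator, and the idea that C locals stand in for belt-resident values (what a real compiler would do with them is not something I can vouch for). The point is only that the loop body touches no arrays and makes no loads or stores of its own.

```c
#include <stdint.h>

/* Estimate pi by sampling points in the unit square and counting how
 * many fall inside the quarter circle. Hypothetical example: the whole
 * trial state is three scalars (s, hits, i), and randomness comes from
 * an inline xorshift32 step rather than from memory. */
double estimate_pi(uint32_t seed, uint32_t trials)
{
    uint32_t s = seed ? seed : 1u;  /* xorshift state must be nonzero */
    uint32_t hits = 0;

    if (trials == 0)
        return 0.0;                 /* avoid dividing by zero */

    for (uint32_t i = 0; i < trials; ++i) {
        /* first draw: x coordinate from the top 16 bits */
        s ^= s << 13; s ^= s >> 17; s ^= s << 5;
        uint64_t x = s >> 16;

        /* second draw: y coordinate from the top 16 bits */
        s ^= s << 13; s ^= s >> 17; s ^= s << 5;
        uint64_t y = s >> 16;

        /* inside the quarter circle of radius 65535? */
        hits += (x * x + y * y <= 65535ull * 65535ull);
    }
    return 4.0 * (double)hits / (double)trials;
}
```

Calling `estimate_pi(12345u, 1000000u)` should give something close to 3.14, and the same seed always gives the same result. Once the per-trial state grows into a table or a lattice that has to live in memory, the loop looks like the streamed case again.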