Reply To: Memory level parallelism and HW scouts

Findecanor

On the question of OoO vs. The Mill’s deferred loads: I’d withhold judgement until I’ve seen benchmarks.
I fully expect The Mill to be worse at some workloads but better at others. With statically scheduled code, performance depends heavily on the compiler.

GPUs and CPUs are now so far apart that they are used for completely different workloads. Only some algorithms can be data-parallelised. AI workloads are also moving away from GPUs to dedicated low-precision matmul cores.
Therefore I think that comparing them that way is a bit silly.

We’re also entering the many-core era for CPUs. You can get a 192-core system today, with kilo-core systems on the way. I expect memory stalls to be more common on those than anywhere else.
Those are found mostly in servers, where overall throughput of general-purpose code is important, as is power consumption.
Again, this is an area where I’d await benchmarks.

But being a data-flow processor with deferred loads is only one area where The Mill is different from conventional architectures.