Thank you for your swift reply.
You were very close to the mark on addressing what I meant to ask (except for the waterfall part).
My primary points were handling multiple values at once and the sector/page/segment/block, …
(give it a name) addressing to be able to be compact with respect to details.
It assumes there will be address locality similarities to exploit, but this is just an assumption on my part. It mainly came up when there was talk about the size of the scratchpad. Bigger size means harder time to encode and any locality if you will could be exploited to lessen the negative side of increased size.
As a slight variation on the multiple values in one go theme, would a “touch” like operation that specifies multiple belt positions (maybe just in the last N positions) that are needed soon again help in any way? Say the belt has 32 positions, and the compiler knows some of the last 8 values will be needed shortly again, explicitly copying them as fresh belt values might be more compact than explicitly saving/restoring them elsewhere. It would be compiler controlled removal of “junk” from the belt by duplicating good values as new. Conceptually a belt could be increased in size with only this operation being able to operate on the extended part and nowhere else. It might even allow for a smaller directly addressed belt in the process, saving bits in every operand encoding. I probably should read up on the existing instructions to understand some more, I am currently shooting a bit from the hip here.
Also, never having worked with an architecture that exposed a scratchpad to me, I am wondering how its typically used. If it would be for constants that are needed a few times during execution of code, I imagine normal memory addressing and the cache system would work just fine. Is there a typical cutoff point where the scratchpad starts to benefit and at what sizes?