Ivan Godard

Rotate/swap ops are not sufficient to replace spill/fill, because the number of long-lived live operands in the queue/belt is unbounded, while the belt itself has a fixed length. The belt length of an individual member is determined by two things: the scratch-pad spill-to-fill latency (three cycles on most members), and the rate at which the core can produce results, which is roughly equal to the number of execution pipelines and runs from ~5 to over 30 depending on the family member. As a rule of thumb we set the length to hold three cycles of production, and tune from there.
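
To make the rule of thumb concrete, here is a minimal sketch of the arithmetic in Python. The function name and parameters are hypothetical, chosen for illustration only, and the result is just the untuned starting point. The belt lengths actually shipped on any member may differ after tuning.

```python
def rule_of_thumb_belt_length(results_per_cycle: int,
                              spill_to_fill_latency: int = 3) -> int:
    """Starting-point belt length: enough positions to hold every result
    produced during one spill-to-fill round trip.

    results_per_cycle     -- assumed peak production rate, roughly the
                             number of execution pipelines
    spill_to_fill_latency -- scratch-pad spill-to-fill latency in cycles
                             (three on most members, per the post)
    """
    if results_per_cycle <= 0 or spill_to_fill_latency <= 0:
        raise ValueError("both arguments must be positive")
    return results_per_cycle * spill_to_fill_latency


# Illustrative pipeline counts spanning the ~5 to 30+ range mentioned above.
for pipelines in (5, 10, 30):
    print(pipelines, rule_of_thumb_belt_length(pipelines))
```

The product is only where tuning begins. It says nothing about rounding or about how much a given workload actually keeps live.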