Difference between revisions of "Scratchpad"

From Mill Computing Wiki
(Created page with "The scratchpad is an on chip overflow buffer for values that need to be readily accessible, but would fall off the end of the Belt. As has been described in the Belt...")
 
m (small corrections)

Revision as of 18:14, 5 June 2015

The scratchpad is an on-chip overflow buffer for values that need to stay readily accessible but would otherwise fall off the end of the Belt.

As described on the Belt page, values on the belt are immutable and fall off the end as new values are added to the front. In roughly 5-10% of cases, however, the value produced by an operation cannot be used immediately as an operand for the next operation, or it needs to be used again in the near future.
In that situation there are several things you can do:
You can do a normal store to memory, going through the cache hierarchy, and load the value again later. But this is wasteful, potentially very slow, and one of the major sources of cache pollution on traditional processors. It also loses all Metadata of the value.
You could just move the value to the front of the belt again to give it some additional lifetime when the delay really is short. Such a facility exists in the form of the promote operation.
Or you could hold the value for a brief time in a dedicated buffer. This buffer is called the Scratchpad.

Like the number of belt positions, the size of the scratchpad is Specification dependent. It is used purely for temporary, Frame-local values. In fact, the Specializer must explicitly reserve the number of storage positions needed by each frame. If the capacity of the scratchpad is exceeded during operation anyway, the Spiller transparently moves those values into the spill buffer and, if needed, eventually into system memory, in the specially protected spiller Turf. This doesn't pollute the caches, and it preserves the values' Metadata. It is also much more efficient than going through the normal caches, since the lifetime and usage of those values are normally completely and statically determined at compile time, whereas the normal caches are built to deal with less predictable access patterns. The operation that puts a value into the scratchpad is called spill, and the operation that retrieves it again is called fill.
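
The interplay of belt, spill and fill can be illustrated with a small toy model. This is only a sketch in Python, not Mill code: the class names ToyBelt and ToyScratchpad, the belt length of 4 and the two reserved cells are made up for illustration, and the model assumes that fill returns the value by dropping it at the front of the belt like any other result. Metadata and the Spiller are not modelled.

```python
from collections import deque


class ToyBelt:
    """Fixed-length belt: new values enter at the front, old ones fall off the end."""

    def __init__(self, length):
        # Index 0 is the front of the belt; maxlen enforces the fixed length.
        self.slots = deque(maxlen=length)

    def drop(self, value):
        # appendleft on a full deque silently discards the value at the far end.
        self.slots.appendleft(value)

    def read(self, position):
        return self.slots[position]


class ToyScratchpad:
    """Frame-local buffer whose size is reserved up front (hypothetical model)."""

    def __init__(self, reserved_cells):
        self.cells = [None] * reserved_cells

    def spill(self, belt, position, cell):
        # Copy a belt value into a scratchpad cell; the belt itself is unchanged.
        self.cells[cell] = belt.read(position)

    def fill(self, belt, cell):
        # Bring the saved value back by dropping it at the front of the belt.
        belt.drop(self.cells[cell])


belt = ToyBelt(length=4)
pad = ToyScratchpad(reserved_cells=2)

belt.drop("a")
pad.spill(belt, position=0, cell=0)  # save "a" while it is still on the belt
for value in ("b", "c", "d", "e"):
    belt.drop(value)                 # four more drops push "a" off the end

fell_off = "a" not in belt.slots
pad.fill(belt, cell=0)               # "a" is available at the front again
front_after_fill = belt.read(0)
```

In this model no memory access is involved at any point: the value leaves the belt, waits in the reserved cell, and re-enters the belt when it is needed.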

All three operations mentioned here (promote, spill and fill) are on the Flow side of the processing slots, and they are known only to the specializer, for the final machine code. Compilers have no use for them, since they don't know the final machine configuration the code will run on, and these operations only make sense, and can only be used at all, when the exact hardware limits are known. It must be the specializer that decides on a case-by-case basis whether to promote a value or to spill and fill it. The case for normal load and store is more clear-cut: those are for more permanent or more complex data structures.
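
Why this decision needs the exact hardware limits can be sketched with a deliberately simple rule. The function choose_rescue and its thresholds are hypothetical and are not the specializer's actual algorithm. The sketch only assumes that a promoted value gets roughly one more trip along the belt, so a wait longer than that calls for spill and fill.

```python
def choose_rescue(drops_until_next_use, belt_length):
    """Pick a way to keep a value alive (hypothetical heuristic, toy model only)."""
    if drops_until_next_use < belt_length:
        return "none"        # value is still on the belt when it is needed
    if drops_until_next_use < 2 * belt_length:
        return "promote"     # one extra trip along the belt is enough
    return "spill/fill"      # longer wait: park the value in the scratchpad


# The same code, with a value needed again after 10 drops,
# gets a different answer on machines with different belt lengths.
on_long_belt = choose_rescue(drops_until_next_use=10, belt_length=16)
on_medium_belt = choose_rescue(drops_until_next_use=10, belt_length=8)
on_short_belt = choose_rescue(drops_until_next_use=10, belt_length=4)
```

Because the answer changes with the belt length, a compiler that does not know the target configuration cannot make this choice ahead of time.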

This was also one of the main motivations behind creating the scratchpad: to make most memory accesses for temporary data unnecessary.