Difference between revisions of "Metadata"
Revision as of 06:09, 3 January 2015
All operands on the belt carry, in addition to the actual byte pattern that makes up the value, a few bits of metadata that inform and augment how the vast majority of operations on those operands work.
This metadata is not restricted to the Belt, but is preserved in the Scratchpad and carries through the result registers in the Slots and so forth.
Although the store operation consults the metadata to know how many bytes to store, it strips the metadata from the value: no metadata exists in the caches or in memory. Loads, on the other hand, initialize the metadata together with the value, since load operations have the basic metadata type tags hard-coded into them.
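The store and load behavior described above can be illustrated with a minimal sketch. It is a toy model, not Mill code: the `Operand` class, the `memory` array, the `store` and `load` helpers, and the little-endian byte order are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """Hypothetical model of a belt operand: raw bits plus a width tag."""
    bits: int   # raw byte pattern, held as an unsigned integer
    width: int  # byte-width metadata: 1, 2, 4, 8 or 16


memory = bytearray(64)  # plain bytes only; there is no room for metadata here


def store(addr: int, op: Operand) -> None:
    # The width tag decides how many bytes are written; the tag itself is dropped.
    # Little-endian order is an assumption of this sketch.
    memory[addr:addr + op.width] = op.bits.to_bytes(op.width, "little")


def load(addr: int, width: int) -> Operand:
    # The width comes from the load operation, not from memory,
    # so the metadata is created afresh together with the value.
    return Operand(int.from_bytes(memory[addr:addr + width], "little"), width)


store(0, Operand(0x1122334455667788, 8))
wide = load(0, 8)         # same bytes read back as an 8-byte operand
narrow_view = load(0, 2)  # same memory, but tagged as a 2-byte operand
```

The same bytes in memory yield differently tagged operands depending only on which load is issued, which is the sense in which memory holds no metadata.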
The Metadata Fields
Scalarity
Any operand can be a SIMD vector, or slice for short. A slice usually has 4 elements, but the element count depends on the Specification of the processor and can be higher. The width of these elements can be any of the scalar widths available on the processor.
Most operations are polymorphic over the Width and Scalarity tags, i.e. the same opcode performs 8-bit to 128-bit integer arithmetic on scalars and vectors, depending on the metadata of the input operands.
Narrow and widen work for slices too, although widen produces 2 outputs with doubled width elements to avoid overflows for the maximum widths.
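As a rough illustration of this polymorphism, the sketch below uses one `add` function for every width and for both scalars and slices, choosing its behavior from the operand tags alone. The `Operand` class, the rejection of mismatched tags, and the wrap-around (modular) overflow behavior are assumptions of the sketch, not statements about the actual instruction set.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Operand:
    """Hypothetical operand: element values plus width and scalarity tags."""
    elems: Tuple[int, ...]  # one element for a scalar, several for a slice
    width: int              # byte width of each element


def add(a: Operand, b: Operand) -> Operand:
    """A single 'opcode' whose behavior is selected by the operand metadata."""
    if a.width != b.width or len(a.elems) != len(b.elems):
        # Assumption: this sketch simply rejects operands with differing tags.
        raise ValueError("operand metadata does not match")
    mask = (1 << (8 * a.width)) - 1  # modular wrap-around is an assumption
    return Operand(tuple((x + y) & mask for x, y in zip(a.elems, b.elems)),
                   a.width)


# The same function serves a 1-byte scalar ...
byte_sum = add(Operand((250,), 1), Operand((10,), 1))
# ... and a 4-element slice of 4-byte values.
slice_sum = add(Operand((1, 2, 3, 0xFFFFFFFF), 4), Operand((1, 1, 1, 1), 4))
```

Because the width and element count travel with the data, no separate byte, word, or vector variants of `add` are needed in this model.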
Width
Every operand value is tagged with its byte width, i.e. 1, 2, 4, 8, or 16. The width doesn't say anything about the interpretation of the bits. An 8-byte value can serve as input for signed and unsigned integer arithmetic, for double float operations, and for pointer arithmetic. How successfully those operations work out depends on the specific bit pattern and the operation semantics. NaN bit patterns can cause Faults with IEEE 754 operations, yet work perfectly fine as integers.
There are narrow and widen instructions to change the width of an operand.
None and NaR
Every operand, and every element in a SIMD slice, has a bit that determines whether the value is valid. When the bit marks the operand as invalid and the actual value content is zero, the operand is a None: an invalid operand that is simply ignored and/or propagated by any operation performed on it. In fact, whenever a new belt is created for a new frame, all belt positions are initialized to None by default.
When the content of an invalid operand is anything other than zero, it is a NaR value. A NaR is also propagated by any operation performed on it, but some operations raise a fault when they encounter one.
Nones and NaRs come in very handy for Speculation and Debugging; see those pages for details.
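The propagation rules can be sketched as follows. Everything here is a hypothetical model: the `Operand` class, the `nar` helper, the choice that a NaR takes precedence over a None when both meet, and the choice of `store` as an operation that ignores a None and faults on a NaR are assumptions made for illustration.

```python
from dataclasses import dataclass


class Fault(Exception):
    """Stands in for a hardware fault in this sketch."""


@dataclass(frozen=True)
class Operand:
    value: int
    invalid: bool = False  # the validity bit from the metadata

    @property
    def is_none(self) -> bool:
        return self.invalid and self.value == 0

    @property
    def is_nar(self) -> bool:
        return self.invalid and self.value != 0


NONE = Operand(0, invalid=True)


def nar(payload: int) -> Operand:
    """Build a NaR; the non-zero payload is what distinguishes it from None."""
    if payload == 0:
        raise ValueError("a NaR needs a non-zero payload")
    return Operand(payload, invalid=True)


def add(a: Operand, b: Operand) -> Operand:
    # Invalid inputs are propagated instead of being computed on.
    for op in (a, b):
        if op.is_nar:   # assumption: NaR wins over None
            return op
    for op in (a, b):
        if op.is_none:
            return op
    return Operand(a.value + b.value)


def store(mem: dict, addr: int, op: Operand) -> None:
    if op.is_none:
        return                      # ignored: nothing is written
    if op.is_nar:
        raise Fault(f"NaR reached a store, payload {op.value}")
    mem[addr] = op.value
```

In this model a speculatively executed chain of operations can carry an invalid value along without any effect, and the problem only surfaces if the value is actually used by a faulting operation.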
IEEE 754 Floating Point Flags
The overflow, underflow, and rounding behavior of floating point operations is captured in a number of flags. On conventional processors those tend to be global state flags. On the Mill that wouldn't work, because global state flags introduce unnecessary data dependencies and prevent speculation. For this reason, every operand carries its own complete set of floating point state flags:
- divide by zero
- inexact
- invalid
- underflow
- overflow
Like all metadata, these bits are propagated through the functional units and all internal data flow whenever they occur. When they are set in operands, they are ORed together in results and propagated further with the results. Only on realization, as in a store, are they written into global state, where they can trigger any of the possible Interrupts.
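A small sketch of this flag flow is shown below. It is a toy model under stated assumptions: the `FpFlags`, `FpOperand`, and `Core` names are hypothetical, only the divide by zero and invalid conditions are detected, and interrupt delivery is left out entirely.

```python
import math
from dataclasses import dataclass, field
from enum import IntFlag


class FpFlags(IntFlag):
    NONE = 0
    DIVIDE_BY_ZERO = 1
    INEXACT = 2
    INVALID = 4
    UNDERFLOW = 8
    OVERFLOW = 16


@dataclass(frozen=True)
class FpOperand:
    value: float
    flags: FpFlags = FpFlags.NONE  # per-operand flags, not global state


def fadd(a: FpOperand, b: FpOperand) -> FpOperand:
    # Flags already present on the inputs are ORed into the result.
    return FpOperand(a.value + b.value, a.flags | b.flags)


def fdiv(a: FpOperand, b: FpOperand) -> FpOperand:
    flags = a.flags | b.flags
    if b.value == 0.0:
        if math.isnan(a.value):
            return FpOperand(math.nan, flags)
        if a.value == 0.0:
            return FpOperand(math.nan, flags | FpFlags.INVALID)
        sign = math.copysign(1.0, a.value) * math.copysign(1.0, b.value)
        return FpOperand(sign * math.inf, flags | FpFlags.DIVIDE_BY_ZERO)
    return FpOperand(a.value / b.value, flags)


@dataclass
class Core:
    """Hypothetical holder of the global state touched only on realization."""
    global_flags: FpFlags = FpFlags.NONE
    memory: dict = field(default_factory=dict)

    def store(self, addr: int, op: FpOperand) -> None:
        self.memory[addr] = op.value   # the flags do not go to memory
        self.global_flags |= op.flags  # realization: flags become global


core = Core()
q = fdiv(FpOperand(1.0), FpOperand(0.0))  # sets DIVIDE_BY_ZERO on the result
z = fdiv(FpOperand(0.0), FpOperand(0.0))  # sets INVALID on the result
total = fadd(fadd(q, FpOperand(2.0)), z)  # both flags travel with the data
flags_before_store = core.global_flags    # still empty at this point
core.store(0, total)
```

Until the store, no global state changes, so the two divisions and the additions could be reordered or executed speculatively without affecting one another.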
Rationale
Overall, the effect of metadata can be described as making everything smoother and more regular, and as such easier to reason about. It curbs bloat, that is, unnecessary complexity.
The width and scalarity metadata tags massively reduce the bloat in the instruction set and make it not only denser and more effective, but also a lot more regular and logical. They even help code reuse by introducing a form of polymorphism on the binary level.
The NaR bit in its two interpretations as an ignored None and an eventual fault eliminates the need for a lot of special and corner case code and opens up untapped reservoirs of instruction level parallelism in doing so. It makes speculative execution simple and straightforward and much more applicable. It enables efficient software Pipelining of most loops. The most limiting factor of ILP tends to be control flow, and both speculation and software pipelining together with Phasing vastly expand the windows for looking for such opportunities, across control flow borders.
These are the main reasons for spending those few extra bits in the core: the huge payback in reduced complexity in many other crucial areas. There are quite a few auxiliary benefits too, for example in Debugging.