Difference between revisions of "Metadata"

From Mill Computing Wiki
(Scalarity)
m (fix links to narrow & widen)
 
(9 intermediate revisions by 2 users not shown)
Line 3:Line 3:
 
This metadata is not restricted to the [[Belt]], but is preserved in the [[Scratchpad]] and carries through the result registers in the [[Slot]]s and so forth.
 
This metadata is not restricted to the [[Belt]], but is preserved in the [[Scratchpad]] and carries through the result registers in the [[Slot]]s and so forth.
  
While the store operation looks into the metadata to know how much to store, it strips the value of it. No metadata in the caches and in memory. Loads on the other hand intialize the metadata together with the value, since load operations have the basic metadata type tags hardcoded into them.
+
While the store operation looks into the metadata to know how much to store, it strips the value of it. No metadata exists in the caches and in memory. Loads, on the other hand, initialize the metadata together with the value, since load operations have the basic metadata type tags hard-coded into them.
  
 
== The Metadata Fields ==
 
== The Metadata Fields ==
 
=== Width ===
 
 
Every operand value is tagged with its byte width, i.e. 1, 2, 4, 8, 16. The width doesn't say anything about the interpretation of the bits. An 8 byte value can serve as input for signed and unsigned interger arithmetics, for double float operations and for pointer arithmetics. How successful those operations work out of course depends on the specific bitpattern and the operation semantics. <abbr title="Not a Number">NaN</abbr> bitpatterns can cause [[Fault]]s with [http://en.wikipedia.org/wiki/IEEE_754 IEEE 754] operations and work perfectly fine as integers.
 
 
There are [[Instruction Set/Narrow|narrow]] and [[Instruction Set/Narrow|widen]] instructions to change the width of an operand.
 
  
 
=== Scalarity ===
 
=== Scalarity ===
Line 17:Line 11:
 
Any operand can be a SIMD vector, or slice for short, usually with 4 elements. Although the SIMD element count depends on the [[Specification]] of the processor. It can be more. The width of these elements can be any of the available widths of scalars on the processor.
 
Any operand can be a SIMD vector, or slice for short, usually with 4 elements. Although the SIMD element count depends on the [[Specification]] of the processor. It can be more. The width of these elements can be any of the available widths of scalars on the processor.
  
Most operations are polymorphic over the Width and Scalarity tags, i.e. the same opcode performs 8bit to 128bit integer arithmetics for scalars and vectors depending on the metadata of the input operands.
+
Most operations are polymorphic over the Width and Scalarity tags, i.e. the same opcode performs 8bit to 128bit integer arithmetic for scalars and vectors depending on the metadata of the input operands.
  
 
Narrow and widen work for slices too, although widen produces 2 outputs with doubled width elements to avoid overflows for the maximum widths.
 
Narrow and widen work for slices too, although widen produces 2 outputs with doubled width elements to avoid overflows for the maximum widths.
 +
 +
=== Width ===
 +
 +
Every operand value is tagged with its byte width, i.e. 1, 2, 4, 8, 16. The width doesn't say anything about the interpretation of the bits. An 8 byte value can serve as input for signed and unsigned integer arithmetic, for double float operations and for pointer arithmetic. How successfully those operations work out depends on the specific bit pattern and the operation semantics. <abbr title="Not a Number">NaN</abbr> bit patterns can cause [[Fault]]s with [http://en.wikipedia.org/wiki/IEEE_754 IEEE 754] operations and work perfectly fine as integers.
 +
 +
There are [[Instruction_Set/narrow|narrow]] and [[Instruction_Set/widen|widen]] instructions to change the width of an operand.
  
 
=== None and <abbr title="Not a Result">NaR</abbr> ===
 
=== None and <abbr title="Not a Result">NaR</abbr> ===
  
Every operand, and every element in a SIMD slice, has a bit that determines whether a value is valid or not. When the actual value content of the operand is zero, this is a None value, an invalid operand that just gets ignored and/or propagated by any operation performed on it. In fact whenever a new belt is created for a new frame, this is the value all belt positions are initalized to by default.
+
Every operand, and every element in a SIMD slice, has a bit that determines whether a value is valid or not. When the actual value content of the operand is zero, this is a None value, an invalid operand that just gets ignored and/or propagated by any operation performed on it. In fact whenever a new belt is created for a new frame, this is the value all belt positions are initialized to by default.
  
When the operand value is something else from zero it is a <abbr title="Not a Result">NaR</abbr> value. Also an invalid value that gets propagated by any operation performed on it. But some operations, when they encounter a NaR raise a fault.
+
When the operand value is something else from zero it is a <abbr title="Not a Result">NaR</abbr> value and also an invalid value that gets propagated by any operation performed on it. But some operations raise a fault when they encounter a NaR.
  
Nones and NaRs come in very handy for [[Speculation]] and [[Debugger|Debugging]], more on that there.
+
Nones and NaRs come in very handy for [[Speculation]] and [[Debugger|Debugging]]. More on that there.
  
 
=== [http://en.wikipedia.org/wiki/IEEE_754 IEEE 754] Floating Point Flags ===
 
=== [http://en.wikipedia.org/wiki/IEEE_754 IEEE 754] Floating Point Flags ===
  
The overflow and underflow and rounding behaviour of floating point operations is captures in a neumber of flags. On conventional processors those tend to be global state flags. On the Mill this wouldn't work because global state flags introduce unnecessary data dependencies and prevent speculation. For this reason every operand carries its own state flags.
+
The overflow and underflow and rounding behavior of floating point operations is captured in a number of flags. On conventional processors those tend to be global state flags. On the Mill that wouldn't work because global state flags introduce unnecessary data dependencies and prevent speculation. For this reason, every operand carries its own complete set floating point state flags:
 +
 
 +
* <u>d</u>ivide by zero
 +
* ine<u>x</u>act
 +
* in<u>v</u>alid
 +
* <u>u</u>nderflow
 +
* <u>o</u>verflow
 +
 
 +
As all metadata those bits are propagated through the functional units and all internal data flow whenever they occur. When they are set in operands, they are ored together in results and propagated further with the results. Only on [[Phasing#Realization|realization]], like stores, they are written into global state and trigger any of the possible [[Interrupts]].
 +
 
  
 
== Rationale ==
 
== Rationale ==
  
The two main tasks of metadata are to vastly improve code density by giving operands types they carry around with them and to enable aggressive speculative execution of branches. It also helps a lot in debugging.
+
Overall the effect of metadata can be described as making everything smoother and more regular, and as such easier to reason about. It curbs bloat, meaning unnecessary complexity.
 +
 
 +
The width and scalarity metadata tags massively reduce the bloat in the instruction set and make it not only denser and more effective, but also a lot more regular and logical. It even helps code reuse by introducing a form of polymorphism on the binary level.
 +
 
 +
The <abbr title="Not a Result">NaR</abbr> bit in its two interpretations as an ignored None and an eventual fault eliminates the need for a lot of special and corner case code and opens up untapped reservoirs of instruction level parallelism in doing so. It makes [[Speculation|speculative execution]] simple and straightforward and much more applicable. It enables efficient software [[Pipelining]] of most loops. The most limiting factor of <abbr title="Instruction Level Parallelism">ILP</abbr> tends to be control flow, and both speculation and software pipelining together with [[Phasing]] vastly expand the windows for looking for such opportunities, across control flow borders.
 +
 
 +
Those are the main reasons for expending those few extra bits in the core, the huge payback in reduced complexity in a lot of other crucial areas. There are quite a few auxiliary benefits too, as in [[Debugging]].
  
A few extra bits in the central data paths are cheap compared to the complex functional hardware and instruction set creep on conventional machines that provide the same features.
+
== Media ==
 +
[http://www.youtube.com/watch?v=DZ8HN9Cnjhc Presentation on Metadata by Ivan Godard] - [http://millcomputing.com/blog/wp-content/uploads/2013/12/metadata.021.pptx Slides]

Latest revision as of 21:13, 9 June 2015

In addition to the actual byte pattern that makes up the value, all operands on the belt carry a few bits of metadata that inform and augment how the vast majority of operations on those operands work.

This metadata is not restricted to the Belt, but is preserved in the Scratchpad and carries through the result registers in the Slots and so forth.

While the store operation consults the metadata to know how many bytes to store, it strips the metadata from the value: no metadata exists in the caches or in memory. Loads, on the other hand, initialize the metadata together with the value, since load operations have the basic metadata type tags hard-coded into them.
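
The store and load behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not Mill code: the `Operand` class, the `store` and `load` helpers, the flat `bytearray` memory, and the little-endian byte order are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """Hypothetical model of a belt operand: a bit pattern plus a width tag."""
    value: int  # raw bit pattern, kept as a non-negative integer
    width: int  # metadata: byte width (1, 2, 4, 8 or 16)


def store(memory: bytearray, addr: int, op: Operand) -> None:
    # The width tag decides how many bytes are written.
    # The tag itself never reaches memory (byte order is an assumption).
    memory[addr:addr + op.width] = op.value.to_bytes(op.width, "little")


def load(memory: bytearray, addr: int, width: int) -> Operand:
    # The width comes from the load operation itself, not from memory.
    raw = bytes(memory[addr:addr + width])
    return Operand(int.from_bytes(raw, "little"), width)


memory = bytearray(16)
store(memory, 0, Operand(0x1234, 2))
print(load(memory, 0, 2))  # the stored bytes, tagged as a 2-byte operand
print(load(memory, 0, 4))  # the same memory, tagged as a 4-byte operand
```

The same memory bytes come back with whatever width the load asks for, which is the sense in which memory holds no metadata.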

The Metadata Fields

Scalarity

Any operand can be a SIMD vector, or slice for short, usually with 4 elements, although the element count depends on the Specification of the processor and can be more. The width of these elements can be any of the available widths of scalars on the processor.

Most operations are polymorphic over the Width and Scalarity tags, i.e. the same opcode performs 8-bit to 128-bit integer arithmetic for scalars and vectors, depending on the metadata of the input operands.

Narrow and widen work for slices too, although widen produces two outputs with double-width elements to avoid overflowing the maximum width.
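
As a rough illustration of this polymorphism, the following Python sketch models a single `add` that works for scalars and slices of any width, and a `widen` that returns two outputs for slices. Everything here is hypothetical: the `Operand` class, the wraparound overflow behavior, and the way `widen` splits the elements between its two outputs are assumptions for the example, not statements about the actual Mill operations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Operand:
    """Hypothetical operand: element values plus Width and Scalarity metadata."""
    elements: Tuple[int, ...]  # one element for a scalar, several for a slice
    width: int                 # byte width of each element


def add(a: Operand, b: Operand) -> Operand:
    # One "opcode" for all widths and scalarities; the metadata selects the behavior.
    if a.width != b.width or len(a.elements) != len(b.elements):
        raise ValueError("operand metadata does not match")
    mask = (1 << (8 * a.width)) - 1  # assumption: results wrap around on overflow
    return Operand(tuple((x + y) & mask for x, y in zip(a.elements, b.elements)), a.width)


def widen(op: Operand) -> Tuple[Operand, ...]:
    # A scalar widens into one output. A slice widens into two outputs with
    # double-width elements; splitting it into halves is an assumption.
    if len(op.elements) == 1:
        return (Operand(op.elements, op.width * 2),)
    half = len(op.elements) // 2
    return (Operand(op.elements[:half], op.width * 2),
            Operand(op.elements[half:], op.width * 2))


print(add(Operand((250,), 1), Operand((10,), 1)))                      # 1-byte scalar add
print(add(Operand((1, 2, 3, 4), 4), Operand((10, 20, 30, 40), 4)))     # 4-byte slice add
print(widen(Operand((1, 2, 3, 4), 1)))                                 # two 2-byte outputs
```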

Width

Every operand value is tagged with its byte width, i.e. 1, 2, 4, 8, or 16. The width doesn't say anything about the interpretation of the bits. An 8-byte value can serve as input for signed and unsigned integer arithmetic, for double float operations, and for pointer arithmetic. How successfully those operations work out depends on the specific bit pattern and the operation semantics. NaN bit patterns can cause Faults with IEEE 754 operations, yet work perfectly fine as integers.
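
That one 8-byte pattern can be read in several ways is easy to demonstrate with Python's standard `struct` module. The snippet below is only an illustration on a conventional host, not Mill code; the chosen bit pattern (a quiet NaN in IEEE 754 double format) and the little-endian layout are assumptions of the example.

```python
import math
import struct

# An 8-byte pattern that is a quiet NaN when read as an IEEE 754 double.
bits = 0x7FF8000000000000
raw = bits.to_bytes(8, "little")

as_unsigned = struct.unpack("<Q", raw)[0]  # unsigned 64-bit integer view
as_signed = struct.unpack("<q", raw)[0]    # signed 64-bit integer view
as_double = struct.unpack("<d", raw)[0]    # double-precision float view

print(as_unsigned + 1)        # ordinary integer arithmetic on the same bits
print(as_signed == as_unsigned)
print(math.isnan(as_double))  # the same bits are a NaN for float operations
```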

There are narrow and widen instructions to change the width of an operand.

None and NaR

Every operand, and every element in a SIMD slice, has a bit that determines whether the value is valid. When the bit marks the value as invalid and the actual value content is zero, this is a None value, an invalid operand that just gets ignored and/or propagated by any operation performed on it. In fact, whenever a new belt is created for a new frame, this is the value all belt positions are initialized to by default.

When the value content of an invalid operand is anything other than zero, it is a NaR value. This is also an invalid value that gets propagated by any operation performed on it, but some operations raise a fault when they encounter a NaR.

Nones and NaRs come in very handy for Speculation and Debugging; see those pages for details.
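
The distinction between None and NaR can be modeled roughly as follows. This is a sketch under stated assumptions: the `Operand` class, the `Fault` exception, the payload value `0x1`, and the choice of `store` as the operation that ignores a None and faults on a NaR are all hypothetical. How an operation treats a mix of None and NaR inputs is not covered by the text, so the sketch simply returns the first invalid input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """Hypothetical operand with a validity bit next to its value content."""
    value: int
    invalid: bool = False


class Fault(Exception):
    """Stand-in for a hardware fault."""


NONE = Operand(0, invalid=True)  # invalid with zero content: a None


def is_none(op: Operand) -> bool:
    return op.invalid and op.value == 0


def is_nar(op: Operand) -> bool:
    return op.invalid and op.value != 0


def add(a: Operand, b: Operand) -> Operand:
    # Invalid inputs are propagated instead of being computed with.
    # Returning the first invalid input is an arbitrary choice of this sketch.
    for op in (a, b):
        if op.invalid:
            return op
    return Operand(a.value + b.value)


def store(memory: dict, addr: int, op: Operand) -> None:
    if is_nar(op):
        raise Fault(f"store of NaR with payload {op.value:#x}")  # assumption: stores fault
    if is_none(op):
        return  # assumption: a None is silently ignored
    memory[addr] = op.value


memory = {}
store(memory, 0, add(Operand(2), Operand(3)))  # normal case
store(memory, 8, add(NONE, Operand(3)))        # None propagates, store does nothing
print(memory)
```

Only the operation that finally consumes the value has to decide what an invalid operand means, which is why intermediate operations can run speculatively without side effects.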

IEEE 754 Floating Point Flags

The overflow, underflow, and rounding behavior of floating point operations is captured in a number of flags. On conventional processors those tend to be global state flags. On the Mill that wouldn't work because global state flags introduce unnecessary data dependencies and prevent speculation. For this reason, every operand carries its own complete set of floating point state flags:

  • divide by zero
  • inexact
  • invalid
  • underflow
  • overflow

Like all metadata, those bits are propagated through the functional units and all internal data flow whenever they occur. When they are set in operands, they are ORed together in results and propagated further with the results. Only on realization, as in stores, are they written into global state, where they can trigger any of the possible Interrupts.
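
A minimal Python sketch of this flag propagation is shown below. The names `FpFlags`, `FpOperand`, `GlobalState`, `fdiv`, `fadd` and `store` are hypothetical, and `fdiv` only models the divide by zero and invalid cases; it is meant to illustrate the ORing and the deferred write to global state, not the full IEEE 754 rules.

```python
import math
from dataclasses import dataclass
from enum import IntFlag


class FpFlags(IntFlag):
    DIVIDE_BY_ZERO = 1
    INEXACT = 2
    INVALID = 4
    UNDERFLOW = 8
    OVERFLOW = 16


@dataclass(frozen=True)
class FpOperand:
    """Hypothetical float operand carrying its own flags as metadata."""
    value: float
    flags: FpFlags = FpFlags(0)


class GlobalState:
    def __init__(self) -> None:
        self.flags = FpFlags(0)


def fdiv(a: FpOperand, b: FpOperand) -> FpOperand:
    flags = a.flags | b.flags  # input flags are ORed into the result
    if math.isnan(a.value) or math.isnan(b.value):
        return FpOperand(math.nan, flags)
    if b.value == 0.0:
        if a.value == 0.0:
            return FpOperand(math.nan, flags | FpFlags.INVALID)
        sign = math.copysign(1.0, a.value) * math.copysign(1.0, b.value)
        return FpOperand(math.copysign(math.inf, sign), flags | FpFlags.DIVIDE_BY_ZERO)
    return FpOperand(a.value / b.value, flags)


def fadd(a: FpOperand, b: FpOperand) -> FpOperand:
    return FpOperand(a.value + b.value, a.flags | b.flags)


def store(state: GlobalState, memory: dict, addr: int, op: FpOperand) -> None:
    # Realization: only now do the flags become globally visible.
    memory[addr] = op.value
    state.flags |= op.flags


state, memory = GlobalState(), {}
x = fdiv(FpOperand(1.0), FpOperand(0.0))          # sets DIVIDE_BY_ZERO on the result
y = fadd(x, fdiv(FpOperand(0.0), FpOperand(0.0)))  # ORs in INVALID
print(state.flags == FpFlags(0))                   # nothing global has changed yet
store(state, memory, 0, y)
print(state.flags)
```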


Rationale

Overall the effect of metadata can be described as making everything smoother and more regular, and as such easier to reason about. It curbs bloat, meaning unnecessary complexity.

The width and scalarity metadata tags massively reduce the bloat in the instruction set and make it not only denser and more effective, but also a lot more regular and logical. It even helps code reuse by introducing a form of polymorphism on the binary level.

The NaR bit, in its two interpretations as an ignored None and an eventual fault, eliminates the need for a lot of special and corner case code, and in doing so opens up untapped reservoirs of instruction-level parallelism. It makes speculative execution simple, straightforward, and much more widely applicable. It enables efficient software Pipelining of most loops. The most limiting factor of ILP tends to be control flow, and both speculation and software pipelining, together with Phasing, vastly expand the window in which such opportunities can be found, across control flow borders.

The main reason for spending those few extra bits in the core is the huge payback in reduced complexity in many other crucial areas. There are quite a few auxiliary benefits too, as in Debugging.

Media

Presentation on Metadata by Ivan Godard - Slides