Forum Replies Created
- AuthorPosts
Protection Lookaside Buffer
On the Mill memory protection and virtual memory translation are separate systems. This allows, among other things, to make both protection and translation cheaper and faster by putting the protection structures on top of the caches and the memory translation below the caches. Actually there are two Protection Lookaside Buffers, iPLB and dPLB covering code and data caches respectively. The iPLB holds execute and portal call permission, the dPLB read and write access.
operation
Since the Mill is a wide issue architecture it is very necesary to distinguish between instructions and operations. An operation is a semantically distinct piece of computation. An instruction is a collection of one ore more operations that get issued together. This tends to be the same on mainstream general purpose architectures.
Not a Result
Another metadata value indicating a real fault in a previous operation that should be raised whenever this data is actually realized to memory. Values with NaR metadata hold information about the nature and location of the fault to aid debugging.
- This reply was modified 10 years, 8 months ago by imbecile.
metadata
All data kept in belt slots and the scratchpad is annotated with metadata bits. This metadata is initialized on value creation and also doesn’t change over its lifetime. The most important information kept in there is the scalar data width, whether it is a SIMD vector and whether it is a valid value or not. It also carries the floating point state bits of floating point operations and possibly more. This information is used throughout the machine to augment the operations performed on the data, like inferring operand widths, how to handle overflows, how to propagate values resulting from speculative computations etc.
instruction stream
The Mill has 2 separate instuction streams that operate in sync. They are divided by functionality, one being the Exu-stream for computation and one being the Flow stream for control flow and memory access and address logic. Consequently there are two program counters, XPC and FPC. On control transfers both go to the same address and then diverge, XPC going down the address space, FPC going upward.
The main reason for this arrangement is simpler, cheaper and faster decoding of instructions and more efficient use of caches.instruction
A collection of operations that get issued together. On the Mill an instruction is divided into two half-bundles, one for each instruction stream, and each half-bundle into a header and 3 blocks of operations, which correspond to the execution phases. One block contains one or more fixed length operations.
exit
An operation that transfers control to a new EBB. This can be a branch or a jump or a call, with immediate addresses or from a belt slot. Calls don’t neccessarily need to be exists though when taken.
Exits are also what is kept in the prediction tables, one for each EBB, instead of each branch.Extended Basic Block
A sequence or batch of instructions that logically belong together. They have one entry point, and can only be entered at that entry point, and one or more exit points. On the Mill there is no implicit exiting an EBB, so every ends with an explicit control flow operation. The EBB plays a central role for organizing code on the Mill. For one, jump addresses are encoded relative to the start of the current EBB. Branch prediction works on EBB exits instead of branches.
bundle
Bundles are a collection of instructions that are fetched from memory together. On the Mill there is not a single bundle pulled at a time but two half-bundles are pulled at a time which together comprise a single very long instruction and can contain over 30 distinct operations. Those two half-bundles are also decoded and issued in sync and together as a very long instruction word would. Often a whole EBB consists of a single such instruction bundle, containing all necessary operations.
belt slot
Serves as a temprary memory to hold operation results and privide operation arguments. Each belt slot has at least 64bit but usually 128bit and additionally a few metadata bits. A belt slot can hold any width data, even vector data, which is indicated by the metadata flags and initialized on load. Belt slots are accesses via the index of their position on the belt. Whenever new values get dropped on the belt that index gets incremented.
belt
The belt is the Mill operand holding/interchange device. It is a fifo of a fixed length of 8, 16 or 32 depending on the specific chip. Operations pick any arguments from the belt and drop the results at the front, removing equally many values from the back in the process. One operatiuon can drop multiple results. While on the belt, values never change, removing any write/read hardware hazards.
- in reply to: Programmer oriented overview #907
I think a glossary would be very useful for newcomers. There are a lot of terms that are used very specifically in the Mill context that could mean different things for hardware people and for software people or just in general.
Although maybe it would be better to have it in a thread here, so it can grow. That would be a kludge though, since there is no wiki.
If it is really done in the forum, then the original post in the glossary thread should be the alphabetical list in kind of this format:
term – short definition
followed by posts that give longer definitions and explanations.
Maybe I just start, if there is a wiki some day this could serve as a starting point or seed.
- AuthorPosts