
Ivan Godard

Where did the pick() operations, and the slots that implement them, go when Gold’s spec changed?

The pick block was folded into the writer block for encoding purposes, although the pick ops could have been put in any of the exu-side blocks, and might be moved again for entropy balancing.

In which slots on the current Gold Mill can pick operations now be encoded and executed?

The only block whose content is dictated by machine timing is the reader block, which decodes a cycle before the others. Reader-phase ops, which execute a cycle (or two) before the others, must be in the reader block to allow time for dispatch. Block assignment of the other ops is essentially arbitrary as far as execution is concerned; the only considerations are code compactness and simplicity of the decoder matrices. As currently organized, any writer block slot on any member supports pick.

I’d like to see a bit more detail about the FU and slot population on at least one member, probably Gold, since some detail on it has been revealed and the design is evolving.

Well, you asked for it :-) From the file members/src/specGold.cc:

c->exuSlots = newRow(8);
c->exuSlots[0] << aluFU << countFU << mulFU << shiftFU << shuffleFU;
c->exuSlots[1] << aluFU << mulFU << shiftFU;
c->exuSlots[2] << aluFU << mulFU;
c->exuSlots[3] << aluFU << mulFU;
c->exuSlots[4] << aluFU;
c->exuSlots[5] << aluFU;
c->exuSlots[6] << aluFU;
c->exuSlots[7] << aluFU;
c->exuSlots[0] << bfpFU << bfpmFU;
c->exuSlots[1] << bfpFU << bfpmFU;
c->exuSlots[2] << bfpFU << bfpmFU;
c->exuSlots[3] << bfpFU << bfpmFU;
c->flowSlots = newRow(8);
c->flowSlots[0] << cacheFU << conFU << conformFU << controlFU << lsFU << lsbFU << miscFU;
c->flowSlots[1] << conFU << conformFU << controlFU << lsFU << lsbFU << miscFU;
c->flowSlots[2] << conFU << conformFU << controlFU << lsFU << lsbFU;
c->flowSlots[3] << conFU << conformFU << controlFU << lsFU << lsbFU;
c->flowSlots[4] << conFU << lsFU << lsbFU;
c->flowSlots[5] << conFU << lsFU << lsbFU;
c->flowSlots[6] << conFU << lsFU << lsbFU;
c->flowSlots[7] << conFU << lsFU << lsbFU;

Remember: this is an untuned dummy specification.
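For readers who want to experiment with the slot population above, here is a minimal sketch of how such a table might be modeled and queried. The Slot type, makeGoldExuSlots, and slotsSupporting are hypothetical names invented for illustration; the real types behind newRow and the FU constants are not shown in the excerpt, so FUs are represented here as plain strings. Only the exu side is reproduced.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a slot: just the set of FU names it holds.
// This is an assumption, not the type used in specGold.cc.
struct Slot {
    std::set<std::string> fus;

    // Chained << mirrors the style of the spec excerpt.
    Slot& operator<<(const std::string& fu) {
        fus.insert(fu);
        return *this;
    }

    bool supports(const std::string& fu) const {
        return fus.count(fu) != 0;
    }
};

// Build the exu-side population exactly as listed in the excerpt.
std::vector<Slot> makeGoldExuSlots() {
    std::vector<Slot> exu(8);
    exu[0] << "aluFU" << "countFU" << "mulFU" << "shiftFU" << "shuffleFU";
    exu[1] << "aluFU" << "mulFU" << "shiftFU";
    exu[2] << "aluFU" << "mulFU";
    exu[3] << "aluFU" << "mulFU";
    for (std::size_t i = 4; i < 8; ++i) exu[i] << "aluFU";
    for (std::size_t i = 0; i < 4; ++i) exu[i] << "bfpFU" << "bfpmFU";
    return exu;
}

// Return the indices of all slots in the row that hold the named FU.
std::vector<std::size_t> slotsSupporting(const std::vector<Slot>& row,
                                         const std::string& fu) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (row[i].supports(fu)) out.push_back(i);
    }
    return out;
}
```

With this model, slotsSupporting(makeGoldExuSlots(), "mulFU") yields slots 0 through 3, and "shuffleFU" appears only in slot 0, matching the listing.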

I know the divide-helper instructions are not yet filed. So I’m hoping the Wiki will soon (hope!) give us some insight into the general core ops vs. emulated operations, if not the details on divide() itself.

The divide helper is rdiv, the reciprocal approximation op. Before you ask, there is also rroot, the square root approximation helper. Emulation sequences for both div and sqrt are being worked on, along with the FP and quad integer emulation sequences.
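To illustrate the general idea of building division on top of a reciprocal approximation, here is a sketch of the textbook Newton-Raphson refinement. It is not the Mill's emulation sequence, which the post does not describe. approxReciprocal is a hypothetical software stand-in for a low-precision reciprocal estimate and does not model rdiv itself; the step count and the linear seed are assumptions chosen for the example.

```cpp
#include <cmath>

// Hypothetical stand-in for a reciprocal approximation helper: a crude
// estimate of 1/d for nonzero finite d. This is NOT a model of rdiv.
double approxReciprocal(double d) {
    int exp = 0;
    // |d| = m * 2^exp with m in [0.5, 1).
    double m = std::frexp(std::fabs(d), &exp);
    // Linear fit to 1/m on [0.5, 1); relative error is at most 1/17.
    double est = 48.0 / 17.0 - (32.0 / 17.0) * m;
    // Undo the scaling and restore the sign of d.
    return std::copysign(std::ldexp(est, -exp), d);
}

// Divide n by d by refining the reciprocal estimate with Newton-Raphson.
// Each step roughly squares the relative error of x as an estimate of 1/d.
double divideViaReciprocal(double n, double d, int steps = 4) {
    double x = approxReciprocal(d);
    for (int i = 0; i < steps; ++i) {
        x = x * (2.0 - d * x);
    }
    return n * x;
}
```

Because the error roughly squares on each step, a seed accurate to about 1/17 reaches double precision in four steps. The result is close to, but not guaranteed to be, the correctly rounded quotient; a real emulation sequence would need extra care for rounding, zero, infinities, and NaNs.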