Mill Computing, Inc

Participant

August 13, 2014 at 5:55 am

Post count: 78

Greetings all,
Where did pick() operations, and the slots that implement them, go when Gold’s spec changed?

In an earlier description of Mill slots and pipelines:
http://millcomputing.com/topic/introduction-to-the-mill-cpu-programming-model-2/#post-610
(Will, thanks for citing that!) Art mentions,
“4 pipes that can do pick operations”

Taking Gold in particular (since Gold’s a specific, evolving member for which some info has been revealed):
In which slots on the current Gold Mill can pick operations now be encoded and executed?

My guess is that pick Ops remain in the Mill ISA, but have been subsumed into other slots (perhaps exu-side?). I note that in Ivan’s latest slot-tally for Gold, he doesn’t break the slot functions down to as fine a level. The earlier description gave examples of a further breakdown by function, for example how many exu-side slots could do multiply() vs. how many others could do simpler ALU ops but not multiply().

If possible, after considering NYF and Guru time/effort, I’d like to see a bit more detail about the FU and slot population on at least one member, probably Gold, since some detail on it has already been revealed and the design is evolving.

Mill evolution, and the motivation for the changes, will IMHO give readers some insight into Mill performance and the sorts of trade-offs that have been made (and can be made) using MillComputing’s specification and simulation tools.

One thing I’d like to see in the Wiki is a description of the instructions and any constraints they impose. (Ivan has mentioned some issues with potential FMA hazards that the scheduler has to avoid.) Ideally, such wiki entries would be automatically generated. But if not, then user-supplied explanations, eventually corrected by the MillComputing team, would give users a better understanding of what the Mill can do, especially at the abstract Mill level.

Theoretically, the Mill could have very few operations as its core ISA, with a larger set emulated by sequences of core operations. While this could be taken to an extreme (how many operations does the classic Turing machine have?), I expect that for performance reasons (and the Mill is a performance-based sale) the Mill will define a sizeable core subset of Ops that are implemented very efficiently on all members (in hardware, most likely), rather than by graph substitution of (longer sequences of) core ops. I’ve read hints that full divide (or was it just floating-point divide?) was probably not going to be an Op in its own right, but would be emulated via substitution of divide-helper operations, which were expected to be in the non-emulated core. I know the divide-helper instructions are not yet filed. So I’m hoping the Wiki will soon (hope!) give us some insight into the general core vs. emulated operations, if not divide() itself.
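
To make the core-vs-emulated idea concrete, here is a minimal sketch of an unsigned divide expanded into nothing but shifts, compares, subtracts and ORs. It is the generic textbook restoring shift-subtract algorithm, not the Mill’s divide-helper scheme (which has not been published), and the name emulated_udiv and the width parameter are made up for illustration.

```python
from typing import Tuple


def emulated_udiv(dividend: int, divisor: int, width: int = 32) -> Tuple[int, int]:
    """Unsigned divide built only from shift, compare, subtract and OR.

    Hypothetical illustration of "emulating" a divide with simpler ops;
    this is NOT the Mill's actual divide-helper sequence.
    Returns (quotient, remainder).
    """
    limit = 1 << width
    if not (0 <= dividend < limit and 0 <= divisor < limit):
        raise ValueError("operands must fit in an unsigned %d-bit value" % width)
    if divisor == 0:
        raise ZeroDivisionError("division by zero")

    quotient = 0
    remainder = 0
    # Walk the dividend from its most significant bit down to bit 0.
    for bit in range(width - 1, -1, -1):
        # Shift the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder
```

Calling emulated_udiv(100, 7) goes through 32 iterations of shift/compare/subtract, which shows why a real member would presumably want helper ops that shorten the sequence rather than a bit-at-a-time loop like this one.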

Reply To: The Belt