- mermerico (Moderator), December 11, 2017 at 7:35 am, Post count: 10
Ivan frequently says that ALUs are 10% of the cost of a modern CPU in terms of die area and power. I'm trying to get more information about this and square it with other pieces of information from the industry. Is this figure for a 64-bit-wide ALU or a modern 256-512 bit ALU? If it's the former, does the cost scale linearly with bits, such that a 512-bit ALU takes up ~50% of the power and area? This would explain why other companies seem to go to great lengths to share ALUs, either by SMT (IBM Power, Intel, AMD Zen) or between two physical cores (AMD Bulldozer family). It would also explain why Intel has to clock its processors down, and roughly doubles their power consumption, whenever they use AVX-512.
An alternative explanation is that ALUs take up a small part of the power and area in these systems, but the wide registers needed to feed these ALUs take up a large chunk of power and area, and that Mill machines will be able to get around this.
Which view is closer to the truth?
- Ivan Godard (Keymaster), December 12, 2017 at 11:35 am, Post count: 689
I had to ask the hardware guys about this (IANAHG). They say:
For add/subtract, the power increase with element size N bits is roughly proportional to N*log3(N).
For shift/rotate, N*log2(N).
For multiply, N*N (or N*N/4 with a Booth first stage).
Power increases linearly with number of elements.
These are only very rough power terms, accurate to an order of magnitude at best, and meant to give an intuitive feel.
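To make the scaling terms above concrete, here is a minimal Python sketch that plugs in a 64-bit scalar unit as the baseline and compares it with (a) a single 512-bit element and (b) eight 64-bit elements side by side. The function names (`add_cost`, `shift_cost`, `mul_cost`, `vector_cost`, `growth`) are made up for illustration. The constants of proportionality are not given in the post, so only ratios within the same operation mean anything; the numbers say nothing about how an adder compares with a multiplier.

```python
import math

# Rough scaling terms as quoted in the post. The constants of proportionality
# are unknown, so these are unitless and only comparable within one operation.

def add_cost(n_bits: int) -> float:
    """Add/subtract: roughly N * log3(N)."""
    return n_bits * math.log(n_bits, 3)

def shift_cost(n_bits: int) -> float:
    """Shift/rotate: roughly N * log2(N)."""
    return n_bits * math.log2(n_bits)

def mul_cost(n_bits: int) -> float:
    """Multiply: roughly N * N (a Booth first stage would divide this by 4)."""
    return float(n_bits * n_bits)

def vector_cost(op, element_bits: int, lanes: int) -> float:
    """Power is stated to grow linearly with the number of elements."""
    return lanes * op(element_bits)

def growth(op, element_bits: int, lanes: int, base_bits: int = 64) -> float:
    """Cost relative to one scalar unit of base_bits for the same operation."""
    return vector_cost(op, element_bits, lanes) / op(base_bits)

if __name__ == "__main__":
    for name, op in [("add", add_cost), ("shift", shift_cost), ("mul", mul_cost)]:
        one_wide = growth(op, 512, 1)    # a single 512-bit element
        eight_lanes = growth(op, 64, 8)  # eight 64-bit elements
        print(f"{name:5s} 1x512-bit: {one_wide:5.1f}x   8x64-bit: {eight_lanes:4.1f}x")
```

Under these terms, eight 64-bit lanes cost 8 times a single 64-bit unit for every operation, while one hypothetical 512-bit element would cost about 12 times as much for add and shift and 64 times as much for multiply. So the answer to "does it scale linearly with bits" depends on whether the extra width goes into more elements (linear) or wider elements (worse than linear).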
AVX-512 contains not only ALUs, but shifters and multipliers as well. When the multipliers kick in is when the lights dim.
The big power win on the Mill is in getting rid of all the OOO machinery and the huge number of registers.
The configuration machinery separately specifies the number of ALUs, multipliers, and divide/sqrt units so configurations can be tuned to workload and the power/performance trade-off.