MIPS/sqrt(W*$) as a better metric (Mill Computing, Inc. forums, Architecture)
In one of the early videos, MIPS/W/$ was used as a metric, which led to discussions on reddit and comp.arch. The problem is that while MIPS/W and MIPS/$ are each useful as metrics, MIPS/W/$ does not scale well. For example, a CPU that does half the MIPS for half the watts and half the cost would score twice as well on the MIPS/W/$ metric. I think using a square root in the denominator fixes this problem and I would like to explain why.
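The scaling claim is easy to check numerically. The sketch below uses made-up figures (100 MIPS, 10 W, $50) purely for illustration; they are not measurements of any real chip, and the function names are hypothetical.

```python
import math

def mips_per_w_per_dollar(mips, watts, dollars):
    # The metric from the videos: MIPS/W/$
    return mips / watts / dollars

def mips_per_sqrt_w_dollar(mips, watts, dollars):
    # The proposed metric: MIPS/sqrt(W*$)
    return mips / math.sqrt(watts * dollars)

# Hypothetical device (MIPS, watts, dollars) and a copy with everything halved.
big = (100.0, 10.0, 50.0)
half = tuple(x / 2 for x in big)

ratio_old = mips_per_w_per_dollar(*half) / mips_per_w_per_dollar(*big)
ratio_new = mips_per_sqrt_w_dollar(*half) / mips_per_sqrt_w_dollar(*big)

print("MIPS/W/$ score ratio (half vs. full):      ", ratio_old)
print("MIPS/sqrt(W*$) score ratio (half vs. full):", ratio_new)
```

The first ratio comes out at about 2 and the second at about 1. In other words, MIPS/W/$ rewards the half-size part merely for being smaller, while MIPS/sqrt(W*$) scores the two the same.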
Let’s start with MIPS/W. How can this one number provide a useful metric over a whole range of devices? Well, you can plot data points on a graph, and I don’t think it really matters whether you use a log scale or plot 1/W instead of W. These data points will define a curve or envelope on which high-end devices pay disproportionately more power for each increment of performance. For example, a 10% increase in performance requires more than a 10% increase in power consumption.
Theoretically there should be a sweet spot on this curve where a 10% increase in performance requires a 10% increase in power. This would mean that you could replace 10 of the slower devices with 9 faster ones and get almost the same performance for almost the same amount of power. This is the point the MIPS/W metric would pick out. Obviously there are problems with this, especially in the case of devices built on a newer, more efficient, but higher-cost process, but we will try to deal with those later.
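The 10-versus-9 arithmetic can be spelled out as follows. The baseline numbers (100 MIPS at 10 W per device) are assumptions chosen only to make the arithmetic visible.

```python
# Hypothetical baseline device at the sweet spot.
slow_mips, slow_watts = 100.0, 10.0

# At the sweet spot, 10% more performance costs exactly 10% more power.
fast_mips = slow_mips * 1.10
fast_watts = slow_watts * 1.10

# Fleet totals: ten slow devices versus nine fast ones.
slow_fleet_mips, slow_fleet_watts = 10 * slow_mips, 10 * slow_watts
fast_fleet_mips, fast_fleet_watts = 9 * fast_mips, 9 * fast_watts

print("slow fleet:", slow_fleet_mips, "MIPS,", slow_fleet_watts, "W")
print("fast fleet:", fast_fleet_mips, "MIPS,", fast_fleet_watts, "W")
print("MIPS/W slow:", slow_fleet_mips / slow_fleet_watts)
print("MIPS/W fast:", fast_fleet_mips / fast_fleet_watts)
```

Both throughput and power for the nine faster devices land at about 99% of the ten slower ones, so MIPS/W is unchanged at this point on the curve.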
You can do a similar analysis with MIPS/$ as a metric although power efficiency is even harder to ignore in this case. Cheap chips that suck power are not really cheaper.
Let’s try one more scenario. Hold MIPS constant and look at the power consumption and cost. In this case you are trying to minimize both W and $. You have some efficient but high cost devices and some cheap but inefficient devices. Somewhere in the middle you are looking for devices that are both low cost and efficient. It is somewhat arbitrary, but again you could look for the sweet spot where a 10% increase in cost gets you a 10% decrease in power consumption. This can be found by minimizing W*$.
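To illustrate why minimizing W*$ lands on the "10% for 10%" point, here is a sketch using an invented trade-off curve at constant MIPS, W($) = A/$^2 + B. The curve shape and the constants A and B are assumptions made for the example, not data.

```python
# Hypothetical trade-off at constant MIPS: spending more buys lower power,
# down to a floor of B watts. Shape and constants are illustrative only.
A, B = 400.0, 1.0

def watts(dollars):
    return A / dollars**2 + B

# Grid search for the cost that minimizes the product W*$.
costs = [1.0 + 0.01 * i for i in range(10000)]
best_cost = min(costs, key=lambda c: c * watts(c))

# Elasticity near the minimum: relative change in W per relative change in $.
step = 1e-4
w0 = watts(best_cost)
w1 = watts(best_cost * (1 + step))
elasticity = ((w1 - w0) / w0) / step

print("cost minimizing W*$:", best_cost)
print("elasticity of W with respect to $ at that cost:", elasticity)
```

For this curve the minimum of W*$ falls at a cost of about 20, where the elasticity is close to -1: a small percentage increase in cost buys the same percentage decrease in power.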
This may not be the actual price point you are looking for if, for example, you decide that the long-term power costs outweigh the up-front cost. But you will still probably be close to the W*$ sweet spot which tends to define the curve in its vicinity. In general, devices with a lower W*$ will perform better on a similar metric.
So the question becomes: how do you combine MIPS/W, MIPS/$, and W*$ into a single metric that allows you to fairly compare different families of devices, allows for scaling, and is not too arbitrary? One solution might be (MIPS/W)*(MIPS/$), but I think the squared MIPS term would be a problem. A better solution would be MIPS/sqrt(W*$), which is the geometric mean of MIPS/W and MIPS/$ and which, at constant MIPS, is maximized exactly when W*$ is minimized.
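The relationship between these candidates can be checked on a few data points. The three devices below are invented for illustration and are not real parts; `metric` is a hypothetical helper name.

```python
import math

# Hypothetical devices: (name, MIPS, watts, dollars). Not real data.
devices = [
    ("small", 1000.0, 2.0, 10.0),
    ("medium", 4000.0, 10.0, 40.0),
    ("large", 10000.0, 60.0, 300.0),
]

def metric(mips, watts, dollars):
    # Proposed metric: MIPS/sqrt(W*$)
    return mips / math.sqrt(watts * dollars)

for name, mips, watts, dollars in devices:
    geo_mean = math.sqrt((mips / watts) * (mips / dollars))
    product = (mips / watts) * (mips / dollars)
    m = metric(mips, watts, dollars)
    # m matches geo_mean, and product is m squared (up to rounding).
    print(f"{name:7s} metric={m:9.1f} geo_mean={geo_mean:9.1f} product={product:13.1f}")
```

Since (MIPS/W)*(MIPS/$) is just the square of MIPS/sqrt(W*$), the two rank devices identically. The square root keeps the score linear in MIPS, so doubling MIPS at fixed watts and cost doubles the score instead of quadrupling it.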
Looking back, I noticed a couple of people mentioned this solution in the reddit discussion, although it was never expanded upon. In particular, Orlandu63 said: “I think originally the unit was mips/sqrt(W*$), in other words, mips divided by the geometric mean of W and $ (because the geometric mean has the useful property of normalizing the ranges of the terms). But eventually the square root was omitted (probably for convenience), leading to mips/W/$.”
I would be curious to see this with real numbers, which I assume you have been collecting for this purpose, although if you’ve decided to drop this for now until you have a product to sell, I can see why. In any case, the videos exist and it’s probably a good idea to have a positive discussion on this site of how meaningful this metric is. If you don’t find this discussion to be constructive, feel free to delete this thread. I will understand.
We stopped using the metric due to lack of hard data and because too many people were confused by it; the concept of a design space (from which the metric came), as opposed to benchmark comparisons, turned out to be too much for many of the audience, and the resulting wrangling was pointlessly distracting.
We still don’t have better data, but have no objection to thoughtful posts on the subject, such as yours.
In my opinion a very good solution is to express Power in terms of Cost and then optimize MIPS/total_cost.
Cost would then include the power of the chip, the cooling for the chip… even the floorspace cost if you think about it.
You can only put so many Watts on a square meter of floor space. There is even application-specific cost, since the administrative effort of using more cores or chips can differ per application.
It might also make sense to add maintenance cost, since the MTBF differs too if you put 80 fast chips in a rack vs. 800 slow ones.
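A minimal sketch of the MIPS/total_cost idea follows. Every parameter here is an assumption for illustration: the three-year lifetime, the electricity price, the cooling overhead factor, and the two example chips are all invented, and `other_costs` is a placeholder for floor space, administration, and maintenance.

```python
def total_cost(purchase_dollars, watts, years=3.0, dollars_per_kwh=0.10,
               cooling_overhead=0.5, other_costs=0.0):
    # Convert power draw into money over the assumed lifetime.
    # cooling_overhead=0.5 means cooling adds 50% on top of chip power.
    hours = years * 365 * 24
    energy_kwh = watts * (1.0 + cooling_overhead) * hours / 1000.0
    return purchase_dollars + energy_kwh * dollars_per_kwh + other_costs

def mips_per_total_cost(mips, purchase_dollars, watts, **kwargs):
    return mips / total_cost(purchase_dollars, watts, **kwargs)

# Two hypothetical chips with equal MIPS: one cheap but power-hungry,
# one expensive but efficient.
cheap = mips_per_total_cost(1000.0, purchase_dollars=50.0, watts=100.0)
efficient = mips_per_total_cost(1000.0, purchase_dollars=150.0, watts=40.0)

print("cheap chip     MIPS per total dollar:", round(cheap, 3))
print("efficient chip MIPS per total dollar:", round(efficient, 3))
```

Under these assumed prices the efficient chip comes out ahead despite costing three times as much up front. The result is sensitive to the lifetime and electricity price chosen, which is exactly the information a fixed W*$ trade-off leaves out.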