Mill Computing, Inc

Participant

June 9, 2014 at 9:26 am

Post count: 5

I think this is the wrong venue in which to bring this up, but it’s the most correct venue I can think of. In other venues, you’ve asked what instructions you could add to low-end Mills that would make writing a software floating point library easy(-ier). I figured I’d chime in with my thoughts on the matter, for what they’re worth.

The biggest problem, and the most error-prone aspect, of writing a software floating point library is unpacking and repacking the floating point numbers themselves, and dealing with the edge cases (denormalized numbers, infinities, NaNs). Unpacking a floating point number into its components is a longish series of shifts, ANDs, and adds. The idea is to just have two instructions. One instruction takes a floating point number and unpacks it, producing the mantissa, exponent, and type (is it a NaN, Infinity, etc.?). As a side note, the belt is really nice for designing instructions. 🙂 The other instruction takes a mantissa, exponent, and type and reconstitutes the original floating point number.
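To make the "longish series of shifts, ANDs, and adds" concrete, here is a rough C sketch of what the two proposed instructions would replace for an IEEE 754 binary64 value. The names `fp_parts`, `fp_kind`, `fp_unpack`, and `fp_pack` are made up for illustration, as is the choice to return the sign as a separate field; none of this is an actual Mill operation or encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags; the encoding is an assumption of this sketch. */
typedef enum { KIND_ZERO, KIND_SUBNORMAL, KIND_NORMAL, KIND_INF, KIND_NAN } fp_kind;

typedef struct {
    uint64_t mant;  /* significand; the implicit leading 1 is made explicit for normals */
    int32_t  exp;   /* unbiased exponent: magnitude = mant * 2^(exp - 52) for finite values */
    int      sign;  /* 1 if the sign bit is set */
    fp_kind  kind;
} fp_parts;

#define FRAC_MASK 0x000FFFFFFFFFFFFFull

/* What the "unpack" instruction would do in one step. */
fp_parts fp_unpack(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);          /* reinterpret without aliasing problems */
    uint64_t frac   = bits & FRAC_MASK;
    int      biased = (int)((bits >> 52) & 0x7FF);

    fp_parts p;
    p.sign = (int)(bits >> 63);
    if (biased == 0x7FF) {                   /* all-ones exponent: infinity or NaN */
        p.kind = frac ? KIND_NAN : KIND_INF;
        p.mant = frac;                       /* NaN payload, or 0 for infinity */
        p.exp  = 0;
    } else if (biased == 0) {                /* zero exponent: zero or subnormal */
        p.kind = frac ? KIND_SUBNORMAL : KIND_ZERO;
        p.mant = frac;                       /* no implicit leading 1 */
        p.exp  = -1022;
    } else {
        p.kind = KIND_NORMAL;
        p.mant = frac | (1ull << 52);        /* restore the hidden bit */
        p.exp  = biased - 1023;
    }
    return p;
}

/* What the "pack" instruction would do. No rounding or normalization is
   attempted here: it only reassembles fields in the form fp_unpack produces. */
double fp_pack(fp_parts p) {
    assert(p.mant < (1ull << 53));
    uint64_t biased;
    switch (p.kind) {
    case KIND_NORMAL: biased = (uint64_t)(p.exp + 1023); break;
    case KIND_INF:
    case KIND_NAN:    biased = 0x7FF; break;
    default:          biased = 0; break;     /* zero and subnormal */
    }
    uint64_t bits = ((uint64_t)p.sign << 63) | (biased << 52) | (p.mant & FRAC_MASK);
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

A real library would also have to round and renormalize before packing, which is where most of the remaining work lives; the sketch only covers the field shuffling that the two instructions would absorb.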

Note that these instructions are useful in other circumstances, not just in software FP libraries, but also in functions like frexp and ldexp and isNaN. These would actually be useful instructions to have even on a Mill Platinum architecture, aimed at supercomputers and given boatloads of floating point units. Their utility in writing a software FP library should be obvious.
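As an illustration of the frexp and isNaN point, here is a minimal sketch of both written with plain shifts and masks on a binary64 value. The names `soft_frexp` and `soft_isnan` are hypothetical; with an unpack instruction, everything up to the exponent adjustment would collapse into a single operation.

```c
#include <stdint.h>
#include <string.h>

#define FRAC_MASK 0x000FFFFFFFFFFFFFull
#define SIGN_MASK 0x8000000000000000ull

/* frexp-style split: returns f with 0.5 <= |f| < 1 and sets *e so that
   d == f * 2^(*e). Zero, infinity, and NaN pass through with *e = 0. */
double soft_frexp(double d, int *e) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    uint64_t frac   = bits & FRAC_MASK;
    int      biased = (int)((bits >> 52) & 0x7FF);

    *e = 0;
    if (biased == 0x7FF || (biased == 0 && frac == 0))
        return d;

    if (biased == 0) {
        /* Subnormal: shift left until the leading 1 reaches the hidden-bit
           position, adjusting the exponent for every shift. */
        biased = 1;
        while (!(frac & (1ull << 52))) {
            frac <<= 1;
            biased--;
        }
        frac &= FRAC_MASK;
    }

    *e = biased - 1022;   /* biased exponent 1022 corresponds to [0.5, 1) */
    bits = (bits & SIGN_MASK) | (1022ull << 52) | frac;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* NaN test: all-ones exponent with a nonzero fraction. */
int soft_isnan(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return ((bits >> 52) & 0x7FF) == 0x7FF && (bits & FRAC_MASK) != 0;
}
```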

Two variants might warrant consideration. One is that instead of a single type value, the instruction drops multiple flags on the belt: isNaN, isInfinity, etc. The advantage of this is that in a single instruction you could unpack a floating point number and multiway branch on the edge cases to special handling code (“Oh, it’s a NaN; in that case, jump here.”). Another would be, instead of producing the mantissa as a single value, to split it into two values. If you don’t have a 64-bit x 64-bit -> 128-bit multiply (likely on a low-end Mill), this makes it easier to do the obvious four 32×32->64 bit multiplies. The problem with both of these variants is that they drop a lot more values on the belt: if you only have 8 belt positions, having a single operation drop 6 or more values on the belt can be a problem.
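For reference, the "obvious four 32×32->64 bit multiplies" look roughly like this in C. This is the standard schoolbook decomposition, not anything Mill-specific, and the `u128` struct and `mul64x64` name are made up for the sketch. Having the mantissa already split into 32-bit halves would remove the first two lines of the function body.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type for targets without a native one. */
typedef struct { uint64_t hi, lo; } u128;

/* 64 x 64 -> 128 bit multiply built from four 32 x 32 -> 64 bit multiplies. */
u128 mul64x64(uint64_t a, uint64_t b) {
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t ll = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t lh = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t hl = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t hh = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum everything that lands in bits 32..63. Each term is below 2^32,
       so the sum fits in 64 bits and its upper half is the carry. */
    uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

    u128 r;
    r.lo = (mid << 32) | (uint32_t)ll;
    r.hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return r;
}
```

For double precision the operands are 53-bit significands, so the high halves only carry 21 significant bits, but the structure of the computation is the same.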

Hope this helps.
