Difference between revisions of "Larry's PseudoCode for Emulating Division"
From Mill Computing Wiki
(Created page with " LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm: Some rough pseudocode follows. Note, I'm defaulting to the variable names used...") | (Added comments about possibly-worthwhile special cases to examine.) | ||
| (5 intermediate revisions by 2 users not shown) | |||
| Line 2: | Line 2: | ||
LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm: | LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm: | ||
| + | Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet. | ||
| + | Instead please make a separate sibling file in the Wiki. | ||
Some rough pseudocode follows. Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible. | Some rough pseudocode follows. Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible. | ||
| − | Function (OK, really more of a macro for expansion) | + | Function (OK, really more of a macro for expansion) |
| + | <pre> | ||
'''divu(n,d) --> q, r''' | '''divu(n,d) --> q, r''' | ||
// For now, assume both n and d are | // For now, assume both n and d are | ||
| − | // | + | // (a) unsigned, |
| − | // (b) the same width | + | // (b) the same width and |
// (c) are less than 128 bits. | // (c) are less than 128 bits. | ||
| − | + | // Unless otherwise specified, all math operations are non-widening versions. | |
| + | //suspect there are some overflow checks that NEED to be added. | ||
| − | if ( | + | if (isNaR(n) || isNar(d)) {return NaR, NaR} // Handle NaR inputs |
| − | if ( | + | if (isNone(n) || isNone(d)) {return None, None} // Handle NaR inputs |
| − | /* How do we determine what width the arguments are? | + | if (0 == d) {return NaR, NaR} // Handle zero divisor |
| + | |||
| + | // There are some special cases that may be worth short circuiting, | ||
| + | // such as d == 2^k, which we detect cheaply by counting leading and trailing | ||
| + | // zeros and can do by shifting | ||
| + | // | ||
| + | // Or n<d, in which case simply return q=0, rem = n | ||
| + | // But these have to happen often enough -- and save enough cycles, by some metric, | ||
| + | // To be worth testing for them, especially on Mills without much exu-side width. | ||
| + | |||
| + | |||
| + | /* '''How do we determine what width the arguments are?''' | ||
* | * | ||
* The width matters, especially when either of the inputs | * The width matters, especially when either of the inputs | ||
| Line 35: | Line 50: | ||
d = widen(d); | d = widen(d); | ||
| + | |||
n = widen(n); // This assumes d and n are same width. MUST FIX LATER! | n = widen(n); // This assumes d and n are same width. MUST FIX LATER! | ||
| − | d = (d << lzd); | + | d = (d << lzd + 1); // I'm essentially putting the binary point at the mid-width |
| − | + | ||
| + | n = (n << lzd + 1); // of the widened input args. | ||
| − | // | + | // I want to try following the Wikipedia N-R algorithm, |
| − | + | ||
// including the suggested scaling. | // including the suggested scaling. | ||
| − | // | + | // S'''till looking for genAsm examples of width-aware code.''' |
| − | + | // Now have an implicit binary point at the midpoint of our width | |
| − | + | // And D is in the interval [1 -- 2) (can be 1, can't be 2 | |
| − | + | // with respect to our implicit binary point | |
x = rdivu(d) * n; // Initialize via rdiv*. Assumes that rdivu is better than | x = rdivu(d) * n; // Initialize via rdiv*. Assumes that rdivu is better than | ||
| − | |||
| − | + | // approximating X0 as = (48/17) - (32/17)*d | |
//******************************************************************** | //******************************************************************** | ||
| Line 63: | Line 77: | ||
t1 = d * x; | t1 = d * x; | ||
| − | t2 = (1 << (half_our_width)) - t1; // How do we determine our width? | + | |
| + | t2 = (1 << ('''half_our_width''')) - t1; // How do we determine our width? | ||
t3 = x * t2; | t3 = x * t2; | ||
| + | |||
x = x + t3; | x = x + t3; | ||
// Repeat above 4 calcs a TDB (and width-dependent!) number of times | // Repeat above 4 calcs a TDB (and width-dependent!) number of times | ||
//********************************************************************* | //********************************************************************* | ||
| + | |||
q = n * x; | q = n * x; | ||
| + | |||
q = q >> 1; // undo the "floating point style" scaling to be in the lower half word | q = q >> 1; // undo the "floating point style" scaling to be in the lower half word | ||
| + | |||
q = narrow(q); // force result back to same width as starting args | q = narrow(q); // force result back to same width as starting args | ||
return q; | return q; | ||
| − | // OPTIONALLY calc and return remainder, | + | // OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling |
| + | </pre> | ||
Latest revision as of 03:59, 22 April 2015
LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm:
Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet. Instead please make a separate sibling file in the Wiki.
Some rough pseudocode follows. Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.
Function (OK, really more of a macro for expansion)
'''divu(n,d) --> q, r'''
// For now, assume both n and d are
// (a) unsigned,
// (b) the same width and
// (c) are less than 128 bits.
// Unless otherwise specified, all math operations are non-widening versions.
//suspect there are some overflow checks that NEED to be added.
if (isNaR(n) || isNar(d)) {return NaR, NaR} // Handle NaR inputs
if (isNone(n) || isNone(d)) {return None, None} // Handle NaR inputs
if (0 == d) {return NaR, NaR} // Handle zero divisor
// There are some special cases that may be worth short circuiting,
// such as d == 2^k, which we detect cheaply by counting leading and trailing
// zeros and can do by shifting
//
// Or n<d, in which case simply return q=0, rem = n
// But these have to happen often enough -- and save enough cycles, by some metric,
// To be worth testing for them, especially on Mills without much exu-side width.
/* '''How do we determine what width the arguments are?'''
*
* The width matters, especially when either of the inputs
* is already at max width (128 bits!!)
*
* For now, I'm assuming BOTH input args are a width were we can apply widen,
* and get a result that's
* the same number of elements as the input. This is bogus, but is a starting point.
*/
lzd = countlz(d);
if (MAX_INT_BITS == width(d)|| MAX_INT_BITS == width(n)) GOTO another algorithm
d = widen(d);
n = widen(n); // This assumes d and n are same width. MUST FIX LATER!
d = (d << lzd + 1); // I'm essentially putting the binary point at the mid-width
n = (n << lzd + 1); // of the widened input args.
// I want to try following the Wikipedia N-R algorithm,
// including the suggested scaling.
// S'''till looking for genAsm examples of width-aware code.'''
// Now have an implicit binary point at the midpoint of our width
// And D is in the interval [1 -- 2) (can be 1, can't be 2
// with respect to our implicit binary point
x = rdivu(d) * n; // Initialize via rdiv*. Assumes that rdivu is better than
// approximating X0 as = (48/17) - (32/17)*d
//********************************************************************
// X := X + X × (1 - D' × X), done without fused multiply-adds :-(
// we want NON-WIDENING multiplied here, I believe.
t1 = d * x;
t2 = (1 << ('''half_our_width''')) - t1; // How do we determine our width?
t3 = x * t2;
x = x + t3;
// Repeat above 4 calcs a TDB (and width-dependent!) number of times
//*********************************************************************
q = n * x;
q = q >> 1; // undo the "floating point style" scaling to be in the lower half word
q = narrow(q); // force result back to same width as starting args
return q;
// OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling