Difference between revisions of "Larry's PseudoCode for Emulating Division"

From Mill Computing Wiki
Jump to: navigation, search
(Created page with " LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm: Some rough pseudocode follows. Note, I'm defaulting to the variable names used...")
 
(Added comments about possibly-worthwhile special cases to examine.)
 
(5 intermediate revisions by 2 users not shown)
Line 2:Line 2:
 
LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm:  
 
LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm:  
  
 +
Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet. 
 +
Instead please make a separate sibling file in the Wiki.
  
 
Some rough pseudocode follows.  Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.   
 
Some rough pseudocode follows.  Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.   
  
Function (OK, really more of a macro for expansion)  
+
Function (OK, really more of a macro for expansion)
  
 +
<pre>
 
'''divu(n,d) --> q, r'''
 
'''divu(n,d) --> q, r'''
  
 
// For now, assume both n and d are  
 
// For now, assume both n and d are  
// (a) unsigned,  
+
// (a) unsigned,  
// (b) the same width and  
+
// (b) the same width and  
 
// (c) are less than 128 bits.
 
// (c) are less than 128 bits.
  
if (isNaR(n) || isNar(d)) {return NaR, NaR}              // Handle NaR inputs
+
// Unless otherwise specified, all math operations are non-widening versions.
 +
//suspect there are some overflow checks that NEED to be added.
  
if (isNone(n) || isNone(d)) {return None, None} // Handle NaR inputs
+
if (isNaR(n) || isNar(d)) {return NaR, NaR} // Handle NaR inputs
  
if (0 == d) {return NaR, NaR}                                           // Handle zero divisor
+
if (isNone(n) || isNone(d)) {return None, None} // Handle NaR inputs
  
/* How do we determine what width the arguments are?
+
if (0 == d) {return NaR, NaR}  // Handle zero divisor
 +
 
 +
// There are some special cases that may be worth short circuiting,
 +
// such as d == 2^k, which we detect cheaply by counting leading and trailing
 +
// zeros and can do by shifting
 +
//
 +
// Or n<d, in which case simply return q=0, rem = n
 +
// But these have to happen often enough -- and save enough cycles, by some metric,
 +
// To be worth testing for them, especially on Mills without much exu-side width.
 +
 
 +
 
 +
/* '''How do we determine what width the arguments are?'''
 
  *
 
  *
 
  * The width matters, especially when either of the inputs  
 
  * The width matters, especially when either of the inputs  
Line 35:Line 50:
  
 
d = widen(d);
 
d = widen(d);
 +
 
n = widen(n);    // This assumes d and n are same width.  MUST FIX LATER!
 
n = widen(n);    // This assumes d and n are same width.  MUST FIX LATER!
  
d = (d << lzd);
+
d = (d << lzd + 1); // I'm essentially putting the binary point at the mid-width
n = (n << lzd); 
+
  
 +
n = (n << lzd + 1); // of the widened input args.
  
  
// The following is a hack (needing a second shift),
+
// I want to try following the Wikipedia N-R algorithm,  
// but I want to try following the Wikipedia N-R algorithm,  
+
 
// including the suggested scaling.
 
// including the suggested scaling.
// Still looking for genAsm examples of width-aware code.
+
// S'''till looking for genAsm examples of width-aware code.'''
  
n = shiftLeft(n, 1);    // Now have an implicit binary point at the midpoint of our width
+
// Now have an implicit binary point at the midpoint of our width
d = shiftLeft(d, 1);    // And D is in the interval [1 -- 2) (can be 1, can't be 2
+
// And D is in the interval [1 -- 2) (can be 1, can't be 2
                                        // with respect to our implicit binary point
+
// with respect to our implicit binary point
  
 
x = rdivu(d) * n;      // Initialize via rdiv*.  Assumes that rdivu is better than  
 
x = rdivu(d) * n;      // Initialize via rdiv*.  Assumes that rdivu is better than  
                                      // approximating X0 as = (48/17) - (32/17)*d
 
  
                                      // I don't think we want a widening multiply; must check
+
                        // approximating X0 as = (48/17) - (32/17)*d
  
 
//********************************************************************
 
//********************************************************************
Line 63:Line 77:
  
 
t1 = d * x;       
 
t1 = d * x;       
t2 = (1 << (half_our_width)) - t1;  // How do we determine our width?
+
 
 +
t2 = (1 << ('''half_our_width''')) - t1;  // How do we determine our width?
  
 
t3 = x * t2;
 
t3 = x * t2;
 +
 
x = x + t3;  
 
x = x + t3;  
  
 
// Repeat above 4 calcs a TDB (and width-dependent!) number of times
 
// Repeat above 4 calcs a TDB (and width-dependent!) number of times
 
//*********************************************************************
 
//*********************************************************************
 +
 
q = n * x;
 
q = n * x;
 +
 
q = q >> 1;    // undo the "floating point style" scaling to be in the lower half word
 
q = q >> 1;    // undo the "floating point style" scaling to be in the lower half word
 +
 
q = narrow(q); // force result back to same width as starting args
 
q = narrow(q); // force result back to same width as starting args
  
 
return q;  
 
return q;  
  
// OPTIONALLY calc and return remainder, INCLUDING scaling
+
// OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling
 +
</pre>

Latest revision as of 03:59, 22 April 2015

LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm:

Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet. Instead please make a separate sibling file in the Wiki.

Some rough pseudocode follows. Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.

Function (OK, really more of a macro for expansion)

'''divu(n,d) --> q, r'''

// For now, assume both n and d are 
// (a) unsigned, 
// (b) the same width and 
// (c) are less than 128 bits.

// Unless otherwise specified, all math operations are non-widening versions.
//suspect there are some overflow checks that NEED to be added.

if (isNaR(n) || isNar(d)) {return NaR, NaR} // Handle NaR inputs

if (isNone(n) || isNone(d)) {return None, None} // Handle NaR inputs

if (0 == d) {return NaR, NaR}   // Handle zero divisor

// There are some special cases that may be worth short circuiting, 
// such as d == 2^k, which we detect cheaply by counting leading and trailing 
// zeros and can do by shifting
//
// Or n<d, in which case simply return q=0, rem = n
// But these have to happen often enough -- and save enough cycles, by some metric,
// To be worth testing for them, especially on Mills without much exu-side width. 


/* '''How do we determine what width the arguments are?'''
 *
 * The width matters, especially when either of the inputs 
 * is already at max width (128 bits!!) 
 *
 * For now, I'm assuming BOTH input args are a width were we can apply widen, 
 * and get a result that's 
 * the same number of elements as the input.  This is bogus, but is a starting point.
 */

lzd = countlz(d);

if (MAX_INT_BITS == width(d)|| MAX_INT_BITS == width(n)) GOTO another algorithm

d = widen(d);

n = widen(n);    // This assumes d and n are same width.  MUST FIX LATER!

d = (d << lzd + 1); // I'm essentially putting the binary point at the mid-width 

n = (n << lzd + 1); // of the widened input args. 


// I want to try following the Wikipedia N-R algorithm, 
// including the suggested scaling.
// S'''till looking for genAsm examples of width-aware code.'''

// Now have an implicit binary point at the midpoint of our width
// And D is in the interval [1 -- 2) (can be 1, can't be 2
// with respect to our implicit binary point

x = rdivu(d) * n;       // Initialize via rdiv*.  Assumes that rdivu is better than 

                        // approximating X0 as = (48/17) - (32/17)*d 

//********************************************************************

// X := X + X × (1 - D' × X), done without fused multiply-adds :-(

// we want NON-WIDENING multiplied here, I believe.

t1 = d * x;      

t2 = (1 << ('''half_our_width''')) - t1;  // How do we determine our width?

t3 = x * t2;

x = x + t3; 

// Repeat above 4 calcs a TDB (and width-dependent!) number of times
//*********************************************************************

q = n * x;

q = q >> 1;    // undo the "floating point style" scaling to be in the lower half word

q = narrow(q); // force result back to same width as starting args

return q; 

// OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling