Difference between revisions of "Larry's PseudoCode for Emulating Division"

From Mill Computing Wiki
Jump to: navigation, search
(Added comments about possibly-worthwhile special cases to examine.)
 
(2 intermediate revisions by 2 users not shown)
Line 4:Line 4:
 
Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet.   
 
Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet.   
 
Instead please make a separate sibling file in the Wiki.
 
Instead please make a separate sibling file in the Wiki.
 
I'm fighting the Wiki's formatting; probably best to view the source, rather than the Wikified version.
 
  
 
Some rough pseudocode follows.  Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.   
 
Some rough pseudocode follows.  Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.   
  
Function (OK, really more of a macro for expansion)  
+
Function (OK, really more of a macro for expansion)
  
 +
<pre>
 
'''divu(n,d) --> q, r'''
 
'''divu(n,d) --> q, r'''
  
 
// For now, assume both n and d are  
 
// For now, assume both n and d are  
// (a) unsigned,  
+
// (a) unsigned,  
// (b) the same width and  
+
// (b) the same width and  
 
// (c) are less than 128 bits.
 
// (c) are less than 128 bits.
  
Line 26:Line 25:
  
 
if (0 == d) {return NaR, NaR}  // Handle zero divisor
 
if (0 == d) {return NaR, NaR}  // Handle zero divisor
 +
 +
// There are some special cases that may be worth short circuiting,
 +
// such as d == 2^k, which we detect cheaply by counting leading and trailing
 +
// zeros and can do by shifting
 +
//
 +
// Or n<d, in which case simply return q=0, rem = n
 +
// But these have to happen often enough -- and save enough cycles, by some metric,
 +
// To be worth testing for them, especially on Mills without much exu-side width.
 +
  
 
/* '''How do we determine what width the arguments are?'''
 
/* '''How do we determine what width the arguments are?'''
Line 69:Line 77:
  
 
t1 = d * x;       
 
t1 = d * x;       
 +
 
t2 = (1 << ('''half_our_width''')) - t1;  // How do we determine our width?
 
t2 = (1 << ('''half_our_width''')) - t1;  // How do we determine our width?
  
 
t3 = x * t2;
 
t3 = x * t2;
 +
 
x = x + t3;  
 
x = x + t3;  
  
Line 86:Line 96:
  
 
// OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling
 
// OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling
 +
</pre>

Latest revision as of 03:59, 22 April 2015

LarryP's division pseudoCode, attempting to follow the Wikipedia Newton-Raphson algorithm:

Non Millcomputing folks, please don't make changes to my pseudocode, at least not yet. Instead please make a separate sibling file in the Wiki.

Some rough pseudocode follows. Note, I'm defaulting to the variable names used in the Wikipedia Newton-Raphson division algorithm, but lower-cased wherever possible.

Function (OK, really more of a macro for expansion)

'''divu(n,d) --> q, r'''

// For now, assume both n and d are 
// (a) unsigned, 
// (b) the same width and 
// (c) are less than 128 bits.

// Unless otherwise specified, all math operations are non-widening versions.
//suspect there are some overflow checks that NEED to be added.

if (isNaR(n) || isNar(d)) {return NaR, NaR} // Handle NaR inputs

if (isNone(n) || isNone(d)) {return None, None} // Handle NaR inputs

if (0 == d) {return NaR, NaR}   // Handle zero divisor

// There are some special cases that may be worth short circuiting, 
// such as d == 2^k, which we detect cheaply by counting leading and trailing 
// zeros and can do by shifting
//
// Or n<d, in which case simply return q=0, rem = n
// But these have to happen often enough -- and save enough cycles, by some metric,
// To be worth testing for them, especially on Mills without much exu-side width. 


/* '''How do we determine what width the arguments are?'''
 *
 * The width matters, especially when either of the inputs 
 * is already at max width (128 bits!!) 
 *
 * For now, I'm assuming BOTH input args are a width were we can apply widen, 
 * and get a result that's 
 * the same number of elements as the input.  This is bogus, but is a starting point.
 */

lzd = countlz(d);

if (MAX_INT_BITS == width(d)|| MAX_INT_BITS == width(n)) GOTO another algorithm

d = widen(d);

n = widen(n);    // This assumes d and n are same width.  MUST FIX LATER!

d = (d << lzd + 1); // I'm essentially putting the binary point at the mid-width 

n = (n << lzd + 1); // of the widened input args. 


// I want to try following the Wikipedia N-R algorithm, 
// including the suggested scaling.
// S'''till looking for genAsm examples of width-aware code.'''

// Now have an implicit binary point at the midpoint of our width
// And D is in the interval [1 -- 2) (can be 1, can't be 2
// with respect to our implicit binary point

x = rdivu(d) * n;       // Initialize via rdiv*.  Assumes that rdivu is better than 

                        // approximating X0 as = (48/17) - (32/17)*d 

//********************************************************************

// X := X + X × (1 - D' × X), done without fused multiply-adds :-(

// we want NON-WIDENING multiplied here, I believe.

t1 = d * x;      

t2 = (1 << ('''half_our_width''')) - t1;  // How do we determine our width?

t3 = x * t2;

x = x + t3; 

// Repeat above 4 calcs a TDB (and width-dependent!) number of times
//*********************************************************************

q = n * x;

q = q >> 1;    // undo the "floating point style" scaling to be in the lower half word

q = narrow(q); // force result back to same width as starting args

return q; 

// OPTIONALLY calc and return remainder, BUT DON'T FORGET the scaling