One interesting micro-problem that would help me understand the Mill is computing the Euclidean distance between two 3D points.
The source code might look like this:
d = sqrt(sqr(a.x - b.x) + sqr(a.y - b.y) + sqr(a.z - b.z))
Or it might be:
d = sqrt(sqr(a[0] - b[0]) + sqr(a[1] - b[1]) + sqr(a[2] - b[2]))
The two forms are naturally equivalent if the fields of the point struct used in the first form are adjacent in memory.
The parallelism of the subtractions and squarings is obvious, and easy to express either as a vector operation or as separate parallel operations. My questions are about the rest:
- If vectorised, can you load a non-power-of-two length vector? (Perhaps it puts a power-of-two length vector on the belt, with None in the last slot?)
- If vectorised, how do you then sum the values in the vector together?
- And if done as separate operations, do you need two sequential add operations to add them together?
- This reply was modified 9 years, 10 months ago by Will_Edwards. Reason: clarifications