I just realized that the blur in my algorithm only happens on a single-channel array, so ignore all mentions of three colors. ::sigh::

The three-color version was used in another part of the same program, and it was actually faster than separate arrays for each color, at least on x86. It is a weird algorithm indeed.

My original question still stands, though.