- joseph.h.garvin | Participant | March 29, 2014 at 1:47 pm | Post count: 21
Perhaps I missed it, but in the lecture with the vectorized strcpy I don’t think alignment requirements were addressed. The VC++ compiler, for example, often has to emit prelude code before vectorized loops to make sure that char pointers point to aligned memory before doing SSE operations with them. Is this necessary on a Mill?
VC++ also has to emit code to deal with ‘left over’ elements when the array/string is not a length that is a multiple of the vector width, but it appears that the None value makes that unnecessary on the Mill.
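For readers unfamiliar with the prelude and ‘left over’ code mentioned above, here is a rough sketch in C of how a conventional vectorizer might split a byte loop into three parts. It is only an illustration: the 16-byte vector width and the names `loop_split` and `split_for_alignment` are assumptions made for this example, not anything VC++ actually emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed vector width in bytes (SSE-sized); chosen for illustration only. */
enum { VEC_BYTES = 16 };

/* Hypothetical record of how many bytes each phase of the loop handles. */
typedef struct {
    size_t prologue; /* scalar iterations until the address is aligned */
    size_t body;     /* bytes handled by full, aligned vector iterations */
    size_t tail;     /* leftover bytes handled by scalar code at the end */
} loop_split;

/* Split a run of n bytes starting at address addr into the three phases. */
loop_split split_for_alignment(uintptr_t addr, size_t n) {
    loop_split s;
    size_t misalign = (size_t)(addr % VEC_BYTES);

    /* Bytes needed to reach the next aligned address, capped at n. */
    s.prologue = misalign ? VEC_BYTES - misalign : 0;
    if (s.prologue > n) {
        s.prologue = n;
    }

    /* Whatever remains is whole vectors plus a short remainder. */
    size_t rest = n - s.prologue;
    s.tail = rest % VEC_BYTES;
    s.body = rest - s.tail;
    return s;
}
```

For a 40-byte run starting 3 bytes past an aligned address, this gives 13 prologue bytes, one 16-byte vector iteration, and 11 tail bytes. The question in this post is whether the Mill needs the first and last of those phases.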
- Ivan Godard | Keymaster | March 29, 2014 at 4:16 pm | Post count: 689
The Mill does support unaligned memory accesses, but as you might expect, using them doubles the memory bandwidth requirements, which can lead to poorer power-performance.
There are a pair of Mill-unique operations that can be used to get a vector loop started with proper alignment. As you also might expect, they produce an initial vector with leading Nones. There wasn’t enough time to cover them in the Metadata talk, but how they work should be fairly obvious.
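As a rough way to picture an initial vector with leading Nones, the idea can be simulated in plain C. This is not Mill code and does not use the actual Mill operations, which are not named in this thread. The 8-lane width, the `lane_t` type with a `valid` flag standing in for None, and the functions `load_block` and `find_nul` are all assumptions invented for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed number of byte lanes per simulated vector. */
enum { LANES = 8 };

/* One simulated lane: a byte plus a flag; valid == false models None. */
typedef struct {
    unsigned char value;
    bool valid;
} lane_t;

/* Load the aligned block at index base. Lanes that lie before start
 * are marked None, so later operations ignore them. */
static void load_block(const unsigned char *buf, size_t base,
                       size_t start, lane_t out[LANES]) {
    for (size_t i = 0; i < LANES; i++) {
        out[i].value = buf[base + i];
        out[i].valid = (base + i >= start);
    }
}

/* Return the index of the first zero byte at or after start, scanning
 * one aligned block at a time. buf_len must be a multiple of LANES.
 * Returns buf_len if no zero byte is found. */
size_t find_nul(const unsigned char *buf, size_t buf_len, size_t start) {
    lane_t v[LANES];

    /* Round start down to a block boundary so every load is aligned. */
    size_t base = start - (start % LANES);

    for (; base < buf_len; base += LANES) {
        /* Only the first block can contain None lanes. */
        load_block(buf, base, start, v);
        for (size_t i = 0; i < LANES; i++) {
            if (v[i].valid && v[i].value == 0) {
                return base + i;
            }
        }
    }
    return buf_len;
}
```

In this simulation every load is aligned, and zero bytes that happen to sit before `start` in the first block are skipped because their lanes are None. That removes the need for a separate scalar prelude, which is the effect described above.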