Any thoughts on instructions to quickly load and store bit-packed representations? They are commonly used in graphics programming, such as here and here, and in general should help with memory-bandwidth-bound tasks, though I'll admit they aren't very useful for general code.
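
To make the question concrete, here is a rough sketch of the shift-and-mask work that packed loads and stores cost in plain C today. RGB565 is just a format I picked for illustration, and the names `pack_rgb565` and `unpack_rgb565` are hypothetical helpers, not from any particular library or ISA proposal.

```c
#include <stdint.h>

/* Hypothetical helper: pack three 8-bit channels into one 16-bit RGB565 word.
 * Layout assumed here: bits 15..11 = red, 10..5 = green, 4..0 = blue. */
static inline uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Hypothetical helper: unpack an RGB565 word back to 8 bits per channel.
 * The top bits of each field are replicated into the low bits so that
 * a full-scale field expands to 255 rather than 248 or 252. */
static inline void unpack_rgb565(uint16_t p, uint8_t *r, uint8_t *g, uint8_t *b) {
    uint8_t r5 = (uint8_t)((p >> 11) & 0x1F);
    uint8_t g6 = (uint8_t)((p >> 5) & 0x3F);
    uint8_t b5 = (uint8_t)(p & 0x1F);
    *r = (uint8_t)((r5 << 3) | (r5 >> 2));
    *g = (uint8_t)((g6 << 2) | (g6 >> 4));
    *b = (uint8_t)((b5 << 3) | (b5 >> 2));
}
```

Each pixel takes several shifts, masks, and ORs on top of the memory access itself, which is the overhead a dedicated packed load/store instruction would presumably fold away.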