VTBL
If you can wait until next year, Intel's Haswell CPUs will have AVX2, which includes instructions for gathered loads. This enables you to do e.g. 8 parallel LUT lookups in one instruction (see e.g. VGATHERDPS). Other than that, you're out of luck, unless your LUTs are quite small (e.g. 16 byte-sized elements), in which case you can use PSHUFB.
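To make both options concrete, here is a rough sketch in C intrinsics. The helper names `lut16_lookup` and `lut_gather8` are made up for illustration. The first assumes a 16-entry byte table and uses `_mm_shuffle_epi8` (PSHUFB); the second assumes a float table indexed by 32-bit integers and uses `_mm256_i32gather_ps` (VGATHERDPS). Each falls back to a plain scalar loop when the compiler is not targeting SSSE3 or AVX2, so treat it as a starting point rather than tuned code.

```c
#include <stdint.h>
#if defined(__AVX2__)
#include <immintrin.h>   /* AVX2: _mm256_i32gather_ps (VGATHERDPS) */
#elif defined(__SSSE3__)
#include <tmmintrin.h>   /* SSSE3: _mm_shuffle_epi8 (PSHUFB) */
#endif

/* Hypothetical helper: out[i] = lut[idx[i] & 0x0F] for 16 byte lanes. */
static void lut16_lookup(const uint8_t lut[16], const uint8_t idx[16],
                         uint8_t out[16])
{
#if defined(__SSSE3__)
    __m128i table   = _mm_loadu_si128((const __m128i *)lut);
    __m128i indices = _mm_loadu_si128((const __m128i *)idx);
    /* PSHUFB zeroes a lane when bit 7 of its index is set, so keep only
       the low 4 bits to stay inside the 16-entry table. */
    indices = _mm_and_si128(indices, _mm_set1_epi8(0x0F));
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(table, indices));
#else
    /* Scalar fallback with the same result. */
    for (int i = 0; i < 16; ++i)
        out[i] = lut[idx[i] & 0x0F];
#endif
}

/* Hypothetical helper: out[i] = lut[idx[i]] for 8 float lanes.
   The caller must guarantee every index is inside the table. */
static void lut_gather8(const float *lut, const int32_t idx[8], float out[8])
{
#if defined(__AVX2__)
    __m256i vindex = _mm256_loadu_si256((const __m256i *)idx);
    /* Scale is 4 because each table entry is a 4-byte float. */
    __m256 values = _mm256_i32gather_ps(lut, vindex, 4);
    _mm256_storeu_ps(out, values);
#else
    /* Scalar fallback with the same result. */
    for (int i = 0; i < 8; ++i)
        out[i] = lut[idx[i]];
#endif
}
```

Build with something like `-mssse3` or `-mavx2` (GCC/Clang) to get the SIMD paths. For byte tables larger than 16 entries, PSHUFB only helps if you split the table into 16-byte chunks and blend the results, which gets expensive quickly.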