Agner wrote:
The 512-bit registers can do vector operations on 32-bit and 64-bit signed and unsigned integers and single and double precision floats, but unfortunately not on 8-bit and 16-bit integers.
The latest update of Intel's manual specifies a future instruction set named AVX512BW which has vectors of 32 16-bit integers or 64 8-bit integers. See software.intel.com/en-us/intel-isa-extensions.
The AVX512 instruction set will be divided into several subsets: AVX512BW for vector instructions with 8-bit (Byte) and 16-bit (Word) granularity; AVX512DQ for 32-bit (Dword or float) and 64-bit (Qword or double) granularity; AVX512VL for the same instructions with 128-bit and 256-bit total vector length; and various other subsets.
The Skylake processor, planned for 2015, will probably support all these subsets, while the Knights Landing multiprocessor will not support the BW subset, according to this announcement: software.intel.com/en-us/blogs/additional-avx-512-instructions.
A 512-bit vector with 8-bit granularity will have 64 elements and require 64-bit mask registers. The mask registers are officially 64-bit architectural registers, according to the manual. It is not clear what architectural means here, but it usually means something that is guaranteed to be supported in future processors. This raises the question of whether future extensions are possible. If future extensions to 1024-bit or 2048-bit vectors support 8-bit and 16-bit granularity, then the mask registers must be wider than 64 bits, and they can no longer communicate nicely with the 64-bit general purpose registers. If the vector size is extended at all in the future, either the extensions will have only 32-bit and 64-bit granularity, or the mask registers will have to be redesigned.
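The arithmetic behind this argument can be checked with a short sketch. It assumes only that a mask needs one bit per vector element and that a general purpose register is 64 bits wide. The names mask_bits, fits_in_gpr and GPR_BITS are made up for this illustration and do not come from Intel's manual.

```python
GPR_BITS = 64  # width of an x86-64 general purpose register


def mask_bits(vector_bits: int, element_bits: int) -> int:
    """Return the number of mask bits needed, assuming one bit per element."""
    if vector_bits % element_bits != 0:
        raise ValueError("vector size must be a multiple of the element size")
    return vector_bits // element_bits


def fits_in_gpr(vector_bits: int, element_bits: int) -> bool:
    """True if a mask for this vector shape fits in one general purpose register."""
    return mask_bits(vector_bits, element_bits) <= GPR_BITS


if __name__ == "__main__":
    # Tabulate the mask width for current and hypothetical vector sizes.
    for vector_bits in (512, 1024, 2048):
        for element_bits in (8, 16, 32, 64):
            bits = mask_bits(vector_bits, element_bits)
            verdict = "fits" if fits_in_gpr(vector_bits, element_bits) else "too wide"
            print(f"{vector_bits:5d}-bit vector, {element_bits:2d}-bit elements: "
                  f"{bits:3d} mask bits ({verdict})")
```

Under these assumptions a 512-bit vector of bytes needs exactly 64 mask bits, a 1024-bit vector of bytes would need 128, and a 2048-bit vector stays within 64 mask bits only with 32-bit or larger elements.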