Now that Knights Landing is officially part of Intel's lineup, I'm quite curious about the performance of the first AVX512 VPUs.
Since AVX512 behaves much like an alias for the original IMCI ISA, maybe the timing measurements for KNL's VPUs (apart from the doubled throughput) won't differ much from those for the KNC VPU. Ref: "Test-Driving Intel Xeon Phi": https://research.spec.org/icpe_proceedings/2014/p137.pdf. It mentions using a method very similar to the one behind the x86 instruction timing tables. One interesting thing to notice is that even the vector logical instructions have a latency of at least 2 cycles, probably due to the vector mask stage.
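For anyone unfamiliar with that kind of measurement, here is a rough sketch of the arithmetic as I understand it: time a long run of the same instruction, subtract the fixed loop/timer overhead, and divide by the instruction count. A dependent chain gives an estimate of latency, while independent instructions give an estimate of reciprocal throughput. The function name and all the numbers below are made up for illustration; they are not taken from the paper or from any real KNC/KNL run.

```python
def per_instruction_cycles(total_cycles, overhead_cycles, n_instructions):
    """Average cycles per instruction after subtracting fixed overhead."""
    if n_instructions <= 0:
        raise ValueError("n_instructions must be positive")
    return (total_cycles - overhead_cycles) / n_instructions


# Hypothetical cycle counts, for illustration only (not measurements).

# Dependent chain: every instruction consumes the previous result,
# so the average cost per instruction approximates the latency.
latency = per_instruction_cycles(
    total_cycles=2040, overhead_cycles=40, n_instructions=1000
)

# Independent instructions: no data dependencies between them,
# so the average cost approximates the reciprocal throughput.
recip_throughput = per_instruction_cycles(
    total_cycles=1040, overhead_cycles=40, n_instructions=1000
)

print(latency, recip_throughput)
```

With these invented inputs the dependent chain works out to 2 cycles per instruction, which is the kind of figure the paper reports for the vector logical instructions; real numbers would have to come from cycle counters on actual hardware.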