Agner's CPU blog


Test results for Intel's Sandy Bridge processor
Author:  Date: 2015-08-25 11:58
If it were only a matter of arithmetic, I would also expect the code to run at 1/2 speed when using only the lower 128-bit pipe. However, on Haswell the transfer of data from the upper 128 bits of the AVX registers to the lower 128 bits has a 3-cycle latency. Although this can be fully pipelined in software (at 1 op/cycle), it is easy to believe that a hardware emulation mode that is only intended to run for a minute fraction of the total cycles might not fully pipeline the "cross-lane" transfer across multiple instructions.
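To make the arithmetic above concrete, here is a rough back-of-envelope sketch in Python. It is not a measurement and not a description of the actual hardware. The function names and the transfers_per_op parameter are hypothetical; the only figures taken from the post are the 3-cycle cross-lane latency and the expected 1/2 speed when both halves go through a single 128-bit pipe. The "normal" rate of one 256-bit op per cycle is an assumption made for the comparison.

```python
# Back-of-envelope model only. It assumes a normal-phase throughput of
# one 256-bit op per cycle, which is an illustrative baseline.

CROSS_LANE_LATENCY = 3  # cycles, upper-to-lower 128-bit transfer (from the post)
PASSES_PER_OP = 2       # two 128-bit passes through the single lower pipe


def cycles_per_256bit_op(transfers_per_op, pipelined):
    """Estimate cycles per 256-bit op emulated on the lower 128-bit pipe.

    transfers_per_op is a hypothetical number of cross-lane moves per op.
    If the moves are fully pipelined, their latency is assumed to be hidden;
    otherwise every move is assumed to expose its full latency.
    """
    if pipelined:
        return PASSES_PER_OP
    return PASSES_PER_OP + CROSS_LANE_LATENCY * transfers_per_op


def slowdown(transfers_per_op, pipelined, normal_cycles_per_op=1):
    """Ratio of emulated cost to the assumed normal cost per op."""
    return cycles_per_256bit_op(transfers_per_op, pipelined) / normal_cycles_per_op


if __name__ == "__main__":
    print(slowdown(1, pipelined=True))   # 2.0: the "1/2 speed" case
    print(slowdown(1, pipelined=False))  # 5.0: one unpipelined transfer per op
    print(slowdown(2, pipelined=False))  # 8.0: two unpipelined transfers per op
```

Under these assumptions, any slowdown noticeably above 2x would be consistent with cross-lane transfers that are not overlapped, but the model cannot say how many transfers the hardware actually performs.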

The uop count is a matter of how the engineers chose to implement the feature. If the implementation is internal to the functional unit, then it would not require extra uops, and I do not see any significant change in uop counts between the "slow" and "normal" phases. (The uop counts are elevated for the iteration that includes the transition, but it is not at all clear what is happening in that step.)

I think I forgot to mention that, just as you noticed on Sandy Bridge, there are no "warm-up" effects when using scalar AVX operations or 128-bit SSE operations. (I did not check 128-bit AVX, but there is little reason for that to differ from 128-bit SSE.) My assumption that the data is running through the "lower" 128-bit pipe is based in part on the observation that the 128-bit pipeline is available at full speed at all times. From an implementation perspective, running the 256-bit operations at a lower frequency does not make a lot of sense when there is a full-speed 128-bit pipeline ready to use.

 
thread Test results for Intel's Sandy Bridge processor new - Agner - 2011-01-30
reply Test results for Intel's Sandy Bridge processor new - PaulR - 2011-02-15
replythread AVX2 new - phis - 2011-06-23
last reply AVX2 new - Agner - 2011-06-23
replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-01
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-06
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-07
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-07
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-07
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-08
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-08
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-09
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-09
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-10
last reply Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-10
replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2013-10-09
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-10-10
last replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2013-10-11
last replythread SB's L1D banks new - Tacit Murky - 2013-11-03
last reply SB's L1D banks new - John D. McCalpin - 2013-11-07
replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-08-18
replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-08-18
last replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-08-24
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-08-25
last reply Test results for Intel's Sandy Bridge processor - John D. McCalpin - 2015-08-25
replythread Haswell upper128 power gating new - Peter Cordes - 2015-08-28
last replythread Haswell upper128 power gating new - Agner - 2016-01-16
last replythread Haswell upper128 power gating new - John D. McCalpin - 2016-01-29
last reply Haswell upper128 power gating new - Agner - 2016-01-30
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-12-20
last replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-12-21
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-12-22
reply Test results for Intel's Sandy Bridge processor new - Robert - 2015-12-24
last replythread Test results for Intel's Sandy Bridge processor new - Just_Coder - 2015-12-25
last reply Test results for Intel's Sandy Bridge processor new - Agner - 2015-12-26
last replythread Test results for Intel's Sandy Bridge processor new - Just_Coder - 2015-08-23
last reply Test results for Intel's Sandy Bridge processor new - Agner - 2015-08-25