Agner's CPU blog


Test results for AMD Bulldozer processor
Author: Alex  Date: 2012-03-14 09:07
Massimo wrote:
* What do you think about the L1D WT choice with higher latency (coupled with a WCC halfway to the L2)? Does it impact the speed much for you?
In his analysis, Agner also wrote about an instruction-throughput penalty when both cores of a module are active. Instead of 4 instructions per clock, he measured only about 3 instructions per clock on average. I speculate that this is an effect of the L1's write-through (WT) strategy. Because of WT, stores have to be sent to the L2, but the L2 can probably handle only *one* store instruction per clock, not 2. Thus, only 3 instructions instead of 4 per module. Agner also reported a maximum of ~3.6-3.7 instructions per clock. Maybe he had more loads than the usual 2:1 load-to-store ratio in that case. But I don't know his code, so I can't say for sure; I can only speculate.
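To make the arithmetic behind this speculation explicit, here is a rough back-of-the-envelope sketch in Python. It is only a toy model and not a measurement. It assumes, as speculated above, that the L2 accepts one store per clock, that the module peaks at 4 instructions per clock, and (my own simplification) that stores make up a fixed fraction of the whole instruction stream. The names `ipc_cap` and `store_fraction_for` are made up for this illustration.

```python
def ipc_cap(store_fraction, peak_ipc=4.0, l2_stores_per_clock=1.0):
    """Upper bound on instructions per clock for one module in the toy model.

    store_fraction      -- fraction of all instructions that are stores (0..1)
    peak_ipc            -- assumed peak for the module (4 per clock)
    l2_stores_per_clock -- assumed L2 store bandwidth (speculated to be 1)
    """
    if not 0.0 <= store_fraction <= 1.0:
        raise ValueError("store_fraction must be between 0 and 1")
    if store_fraction == 0.0:
        # No stores at all: the write-through path cannot be the bottleneck.
        return peak_ipc
    # Every store must go through to the L2, so the stream cannot run
    # faster than (L2 store bandwidth) / (stores per instruction).
    return min(peak_ipc, l2_stores_per_clock / store_fraction)


def store_fraction_for(ipc, l2_stores_per_clock=1.0):
    """Store fraction at which the toy model's store-bound limit equals ipc."""
    return l2_stores_per_clock / ipc


if __name__ == "__main__":
    # 2:1 load-to-store ratio, assuming all instructions are loads or stores.
    print(f"cap at 1/3 stores: {ipc_cap(1.0 / 3.0):.2f} instr/clock")
    # Store fraction that would match the reported ~3.6-3.7 maximum.
    print(f"store fraction for 3.65 instr/clock: {store_fraction_for(3.65):.2f}")
```

In this model a stream with one store in every three instructions is capped at 3 per clock, and a maximum of about 3.65 per clock would correspond to stores being roughly 27% of the stream. Whether Agner's test code actually had such a mix is unknown, so this only shows that the numbers are consistent with the speculation, not that the speculation is right.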
 
thread Test results for AMD Bulldozer processor new - Agner - 2012-03-02
replythread Test results for AMD Bulldozer processor new - Massimo - 2012-03-13
reply Test results for AMD Bulldozer processor new - Agner - 2012-03-14
last reply Test results for AMD Bulldozer processor - Alex - 2012-03-14
replythread Test results for AMD Bulldozer processor new - fellix - 2012-03-15
last replythread Test results for AMD Bulldozer processor new - Agner - 2012-03-16
last replythread Test results for AMD Bulldozer processor new - Massimo - 2012-03-16
last replythread Test results for AMD Bulldozer processor new - Agner - 2012-03-17
reply Test results for AMD Bulldozer processor new - avk - 2012-03-17
last replythread Test results for AMD Bulldozer processor new - Massimo - 2012-03-17
last replythread Test results for AMD Bulldozer processor new - Agner - 2012-03-17
last replythread Test results for AMD Bulldozer processor new - Massimo - 2012-03-20
last replythread Test results for AMD Bulldozer processor new - Agner - 2012-03-21
last reply Cache WT performance of the AMD Bulldozer CPU new - GordonBGood - 2012-06-05
reply Test results for AMD Bulldozer processor new - zan - 2012-04-03
replythread Multithreads load-store throughput for bulldozer new - A-11 - 2014-06-27
last replythread Multithreads load-store throughput for bulldozer new - Bigos - 2014-06-28
last reply Multithreads load-store throughput for bulldozer new - A-11 - 2014-07-04
last reply Store forwarding stalls of piledriver new - A-11 - 2014-09-07