Agner`s CPU blog

Software optimization resources | E-mail subscription to this blog | www.agner.org

SIMD exceptions are fine with masking
Author:  Date: 2016-08-19 15:37
Agner wrote:
The following methods for detecting integer overflow may be considered:
  1. Use a flag output. This gives every ALU instruction an extra output dependency, and also an extra input dependency if you want the overflow bit to be propagated through a series of calculations. The extra input and output dependencies hamper out-of-order processing.
  2. Add an extra overflow bit for each vector element in a vector register. This adds an extra complication for saving the overflow bits when a vector register is spilled to memory.
  3. Use exception traps (interrupts) to catch integer overflow. This requires a minimum of complexity to the instruction set, but it has a problem with vector code.
  4. Use the even-numbered elements in a vector register for calculation and the odd-numbered elements for propagating overflow information. This is relatively easy to implement but wasteful because only half of the ALUs are used.
Method 1 has been rejected. The other three methods are still being considered for use in ForwardCom.
What about:
5. Same as 3, and use "traditional" masking on all SIMD instructions.
In traditional masking, an SIMD instruction working on vectors of n operands takes an n-bit mask. For each lane i: If bit i is set, the operation is performed normally on lane i (including triggering exceptions as needed). If bit i is unset, nothing happens on lane i (including silencing exceptions). If multiple exceptions happen in the same vector, you may either report them all at once by exposing an exception mask, or report them one by one in a well-defined order.

AVX-512 as well as AMD and Nvidia GPUs work that way, and so do most SIMD machines since the ILLIAC-IV, 45 years ago. The only exceptions were the MMX/SSE/AVX-style short-vector instruction sets, which were not "real" SIMD (at least not until AVX-512). There is no problem with exceptions in SIMD code, as long as per-lane masking is properly defined.

There is some overhead for handling such masking in an out-of-order processor as we discussed, but zeroing can avoid this overhead in many cases.

 
thread Proposal for instruction set - now on Github new - Agner - 2016-06-26
replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-07-04
last replythread Proposal for instruction set - now on Github new - Agner - 2016-07-04
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-06
last replythread Proposal for instruction set - now on Github new - Agner - 2016-07-06
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-07
last reply Proposal for instruction set - now on Github new - Agner - 2016-07-07
replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
reply Number of input dependencies new - Agner - 2016-08-16
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-16
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-17
last replythread Merging with first operand new - Sylvain Collange - 2016-08-18
last replythread Merging with first operand new - Agner - 2016-08-19
replythread SIMD exceptions are fine with masking - Sylvain Collange - 2016-08-19
last replythread SIMD exceptions are fine with masking new - Agner - 2016-08-20
reply SIMD exceptions are fine with masking new - Hubert Lamontagne - 2016-08-20
last reply SIMD exceptions are fine with masking new - Sylvain Collange - 2016-08-25
last reply Merging with first operand new - Hubert Lamontagne - 2016-08-19
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-17
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-18
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-31
reply Proposal for instruction set - now on Github new - Agner - 2016-08-31
last reply Proposal for instruction set - now on Github new - Jorcy Neto - 2016-09-01
replythread Proposal for instruction set - now on Github new - Yuhong Bao - 2016-07-12
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-12
replythread Things from MIPS (and novel things) new - Anonymous - 2016-07-28
replythread Things from MIPS (and novel things) new - Agner - 2016-07-28
last reply Things from MIPS (and novel things) new - Hubert Lamontagne - 2016-07-28
last replythread Matrix multiplication new - Agner - 2016-07-29
reply Matrix multiplication new - Hubert Lamontagne - 2016-07-29
last replythread Matrix multiplication new - John D. McCalpin - 2016-07-29
last reply Matrix multiplication new - Agner - 2016-07-29
replythread Introduction website new - Agner - 2016-08-01
last replythread Introduction website new - EricTL - 2017-07-17
last replythread Introduction website new - Agner - 2017-07-18
last replythread Introduction website new - EricTL - 2017-07-20
last reply Introduction website new - Agner - 2017-07-20
replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-04
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-04
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-05
replythread Proposal for instruction set - now on Github new - Agner - 2016-08-06
last replythread Proposal for instruction set - now on Github new - fanoI - 2016-08-08
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-08
last reply Proposal for instruction set - now on Github new - fanoI - 2016-08-09
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-08
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-09
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-11
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-12
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-12
replythread Proposal for instruction set - now on Github new - grant galitz - 2016-08-22
reply Proposal for instruction set - now on Github new - Agner - 2016-08-22
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-24
replythread ARM with scalable vector extensions new - Agner - 2016-08-23
replythread ARM with scalable vector extensions new - Jorcy Neto - 2016-08-23
last reply ARM with scalable vector extensions new - Hubert Lamontagne - 2016-08-26
last reply ARM with scalable vector extensions new - Jorcy Neto - 2016-12-20
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-05
replythread Proposal for instruction set - now on Github new - Agner - 2016-09-05
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-05
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-06
reply Proposal for instruction set - now on Github new - Bigos - 2016-09-06
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-06
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-07
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-07
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-08
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-08
last replythread Proposal for instruction set - now on Github new - Commenter - 2016-09-07
last reply Proposal for instruction set - now on Github new - Bigos - 2016-09-08
last replythread Paging new - Kurt Baumgardner - 2016-09-09
replythread Paging new - Agner - 2016-09-10
reply Paging new - Hubert Lamontagne - 2016-09-11
last replythread Paging new - Kurt Baumgardner - 2016-09-13
replythread Paging new - Agner - 2016-09-13
last reply Paging new - Kurt Baumgardner - 2016-09-13
last replythread Paging new - Hubert Lamontagne - 2016-09-13
last reply Paging new - Kurt Baumgardner - 2016-09-14
replythread Paging new - Hubert Lamontagne - 2016-09-11
last reply Paging new - Kurt Baumgardner - 2016-09-13
last replythread Paging new - Agner - 2016-09-14
last reply Paging new - Jorcy Neto - 2016-09-18
replythread A null register? new - csdt - 2016-09-23
last replythread A null register? new - Agner - 2016-09-24
last replythread A null register? new - Hubert Lamontagne - 2016-09-24
replythread A null register? new - csdt - 2016-09-26
last reply A null register? new - Agner - 2016-09-27
last replythread Indexed registers new - Kurt Baumgardner - 2016-09-26
last replythread Indexed registers new - Agner - 2016-09-27
replythread Indexed registers new - Kurt Baumgardner - 2016-09-27
last reply Indexed registers new - Agner - 2016-09-28
last replythread Indexed registers new - Hubert Lamontagne - 2016-09-28
last replythread Indexed registers new - Kurt Baumgardner - 2016-10-03
reply Indexed registers new - Agner - 2016-10-03
last replythread Indexed registers new - Hubert Lamontagne - 2016-10-04
last replythread Bilinear Interpolation new - Hubert Lamontagne - 2016-10-28
last replythread Bilinear Interpolation new - Agner - 2016-10-29
last replythread Bilinear Interpolation new - Hubert Lamontagne - 2016-10-29
last replythread Bilinear Interpolation new - Agner - 2016-10-30
last reply Bilinear Interpolation new - Hubert Lamontagne - 2016-10-30
replythread ForwardCom version 1.04 new - Agner - 2016-12-08
replythread ForwardCom version 1.04 new - Matthias Bentrup - 2016-12-12
last replythread ForwardCom version 1.04 new - Agner - 2016-12-12
last reply ForwardCom version 1.04 new - Matthias Bentrup - 2016-12-14
last replythread Async system calls; horizontal packing instruction new - Joe Duarte - 2016-12-14
reply Async system calls; horizontal packing instruction new - Agner - 2016-12-15
last replythread Comparison of instruction sets new - Agner - 2016-12-17
replythread Comparison of instruction sets new - Joe Duarte - 2016-12-28
reply Comparison of instruction sets new - Agner - 2016-12-29
last reply Comparison of instruction sets new - Hubert Lamontagne - 2016-12-30
last reply Comparison of instruction sets new - Hubert Lamontagne - 2017-01-05
replythread ForwardCom version 1.05 new - Agner - 2017-01-22
replythread Syscall/ISR acceleration new - Jonathan Brandmeyer - 2017-01-22
last replythread Syscall/ISR acceleration new - Agner - 2017-01-23
last replythread Syscall/ISR acceleration new - Jonathan Brandmeyer - 2017-01-25
last reply Syscall/ISR acceleration new - Agner - 2017-01-25
replythread ForwardCom version 1.05 new - Jiří Moravec - 2017-01-23
last reply ForwardCom version 1.05 new - Agner - 2017-01-24
last replythread Jump prefetch? new - csdt - 2017-01-27
last replythread Jump prefetch? new - Agner - 2017-01-27
last replythread Jump prefetch? new - csdt - 2017-01-30
last replythread Jump prefetch? new - Agner - 2017-01-30
last replythread Jump prefetch? new - csdt - 2017-01-30
replythread Jump prefetch? new - Agner - 2017-01-31
reply Jump prefetch? new - csdt - 2017-01-31
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-02-01
last replythread Jump prefetch? new - Agner - 2017-02-01
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-02-01
last replythread Jump prefetch? new - Agner - 2017-02-02
last reply Jump prefetch? new - Agner - 2017-02-14
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-01-31
last replythread High precision arithmetic new - fanoI - 2017-03-21
last reply High precision arithmetic new - Agner - 2017-03-21
replythread Intel's Control-flow Enforcement Technology new - Joe Duarte - 2017-04-13
last reply Intel's Control-flow Enforcement Technology new - Agner - 2017-04-14
reply Proposal for instruction set - now on Github new - Agner - 2017-04-27
replythread Assembler with metaprogramming features new - Agner - 2017-07-27
last replythread Assembler with metaprogramming features new - Kai Rese - 2017-08-11
last replythread Assembler with metaprogramming features new - Agner - 2017-08-11
last replythread Assembler with metaprogramming features new - Kai Rese - 2017-08-14
last replythread Assembler with metaprogramming features new - Agner - 2017-08-14
last reply Assembler with metaprogramming features new - Kai Rese - 2017-08-15
replythread Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-22
last replythread Number of register file ports in implementations new - Agner - 2017-08-23
last replythread Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-27
last replythread Number of register file ports in implementations new - Agner - 2017-08-28
reply Number of register file ports in implementations new - Bigos - 2017-08-28
last reply Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-28
replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-20
replythread Proposal for instruction set - now on Github new - Agner - 2017-09-20
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-20
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-20
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-21
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-21
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-21
last reply Proposal for instruction set - now on Github new - Agner - 2017-09-23
replythread Proposal for instruction set - now on Github new - - - 2017-09-22
last reply Proposal for instruction set - now on Github new - Agner - 2017-09-23
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2017-09-25
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-26
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2017-09-26
last reply New assembler, new version, new forum new - Agner - 2017-11-03