Agner`s CPU blog

Software optimization resources | E-mail subscription to this blog | www.agner.org

Proposal for an ideal extensible instruction set
Author: Agner Date: 2016-02-24 00:58
Add with carry:
Your idea of a 4-input add with carry is interesting. I see a few problems, though:
* You will need more space in the opcode to contain 4 registers.
* Handling multiple inputs is just as difficult as multiple outputs. For many years, Intel had a limitation of 2 inputs per micro-operation, and they had to split add-with-carry and several other instructions into 2 microoperations for the same reason.
* The instruction might have a latency of 2 clock cycles unless you can implement it with two double-speed adders. Mixing instructions with different latencies is a problem because a 2-clock instruction may need the result bus at the same time as a subsequent 1-clock instruction. It is best to standardize instruction latencies and have as few different latencies as possible.
* It is problematic to make a hardware design that cannot handle instructions with two outputs. You also need two outputs for integer division (output quotient and remainder) and for full length integer multiplication.

I can see the following possibilities for implementing 2-output functions:
1. Use one instruction with two output registers.
2. Use two separate instructions, possibly executed simultaneously, one for each output.
3. Use two elements of a vector register.

Method 1 would certainly be the most efficient and straightforward solution. We just have to weigh the hardware costs versus the benefits.
Method 2 is less efficient. For example, for integer division, you cannot make two divisions simultaneously, or even pipelined, unless you double the hardware.
Method 3 might be an efficient solution for a scalar add-with-carry chain, but for other purposes, it will complicate the software when you have to split and join data into vector elements. It becomes more complicated when you want to vectorize vectors. You may use even-numbered vector elements for addend and sum, and odd-numbered vector elements for carry. This will complicate both hardware and software, and the throughput will be half of what you would get with method 1.

Instruction length:
As I argued before, you need 32 bits for address offset, so you must allow instructions of two 32-bit words.

Many instruction sets don't allow big immediate constants. For example, to load a 32-bit constant you need a memory operand with a 32-bit offset. My argument is that it is more efficient to have a 32-bit immediate operand than a 32-bit offset to a 32-bit memory operand. This will reduce the loading on the data cache. Remember that cache misses are very expensive.

In my analysis I found that you may need instructions of three 32-bit words to accommodate all the bells and whistles of vector instructions with a memory operand, variable vector length, mask, etc. There may be more needs for long instructions in the future as new features are invented. At least the current trend goes towards putting more features into a single instruction to get higher overall performance. This is the reason for my decision to allow instruction lengths of one, two and three 32-bit words. This is certainly a compromise, since instruction length decoding becomes more expensive the more different instruction lengths you have.

If we decide to allow an instruction length of 3*32 bits then we can afford the luxury of allowing immediate constants of 64 bits, for example a double precision float or a 64-bit absolute address.

On the other hand, if we limit the instruction length to 2*32 bits, then there will be certain instructions that cannot have a memory operand with 32-bit offset, and we will need two instructions to load a 64-bit immediate constant. This would still be a viable solution, but I suspect that a patch would be added in the future when the need for more instruction bits arise as more features are added. Remember how many patches have been added to old instruction sets.

 
thread Proposal for an ideal extensible instruction set new - Agner - 2015-12-27
replythread Itanium new - Ethan - 2015-12-28
last reply Itanium new - Agner - 2015-12-28
replythread Proposal for an ideal extensible instruction set new - hagbardCeline - 2015-12-28
last replythread Proposal for an ideal extensible instruction set new - Agner - 2015-12-28
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
replythread Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-05
replythread Proposal for an ideal extensible instruction set new - John D. McCalpin - 2016-01-05
last reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-06
last reply Proposal for an ideal extensible instruction set new - Ook - 2016-01-05
last reply Proposal for an ideal extensible instruction set new - acppcoder - 2016-03-27
reply Proposal for an ideal extensible instruction set new - Jake Stine - 2016-01-11
replythread Proposal for an ideal extensible instruction set new - Agner - 2016-01-12
last replythread Proposal for an ideal extensible instruction set new - Jonathan Morton - 2016-02-02
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-02-03
last replythread Proposal for an ideal extensible instruction set new - Jonathan Morton - 2016-02-12
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-02-18
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-02-21
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-02-22
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-02-23
replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-02-23
last replythread Proposal for an ideal extensible instruction set - Agner - 2016-02-24
last replythread Proposal for an ideal extensible instruction set new - asdf - 2016-02-24
last reply Proposal for an ideal extensible instruction set new - Agner - 2016-02-24
last reply Proposal for an ideal extensible instruction set new - Agner - 2016-02-25
replythread limit instruction length to power of 2 new - A-11 - 2016-02-24
last replythread limit instruction length to power of 2 new - Agner - 2016-02-24
replythread Any techniques for more than 2 loads per cycle? new - Hubert Lamontagne - 2016-02-24
last reply Any techniques for more than 2 loads per cycle? new - Agner - 2016-02-25
last replythread limit instruction length to power of 2 new - A-11 - 2016-02-25
last reply limit instruction length to power of 2 new - Hubert Lamontagne - 2016-02-25
replythread More ideas new - Agner - 2016-03-04
replythread More ideas new - Hubert Lamontagne - 2016-03-07
last reply More ideas new - Agner - 2016-03-08
last reply More ideas new - Agner - 2016-03-09
replythread Proposal for an ideal extensible instruction set new - Joe Duarte - 2016-03-07
reply Proposal for an ideal extensible instruction set new - Agner - 2016-03-08
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-03-08
last replythread Proposal for an ideal extensible instruction set new - Joe Duarte - 2016-03-09
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-03-10
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-03-11
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-03-11
last replythread Proposal for an ideal extensible instruction set new - anon2718 - 2016-03-13
last reply Proposal for an ideal extensible instruction set new - Agner - 2016-03-14
replythread A design without a TLB new - Agner - 2016-03-11
replythread A design without a TLB new - Hubert Lamontagne - 2016-03-11
reply A design without a TLB new - Agner - 2016-03-11
last reply A design without a TLB new - Agner - 2016-03-12
reply A design without a TLB new - Bigos - 2016-03-13
last reply A design without a TLB new - Agner - 2016-03-28
replythread Proposal now published new - Agner - 2016-03-22
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-23
last replythread Proposal now published new - Agner - 2016-03-24
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-24
last replythread Proposal now published new - Agner - 2016-03-24
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-24
last replythread Proposal now published new - Agner - 2016-03-25
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-28
last replythread Proposal now published new - Agner - 2016-03-29
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-30
last replythread Proposal now published new - Agner - 2016-03-30
last replythread Do we need instructions with two outputs? new - Agner - 2016-03-31
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-01
reply Do we need instructions with two outputs? new - Agner - 2016-04-01
replythread Do we need instructions with two outputs? new - Joe Duarte - 2016-04-02
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-02
last reply Do we need instructions with two outputs? new - Joe Duarte - 2016-04-02
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-02
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-02
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-03
reply Do we need instructions with two outputs? new - Joe Duarte - 2016-04-03
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-03
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-04
reply Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-04
last replythread Do we need instructions with two outputs? new - Joe Duarte - 2016-04-06
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-07
last replythread Do we need instructions with two outputs? new - HarryDev - 2016-04-08
last reply Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-09
replythread How about stack machine ISA? new - A-11 - 2016-04-10
last replythread treating stack ISA as CISC architecure new - A-11 - 2016-04-14
last replythread treating stack ISA as CISC architecure new - Agner - 2016-04-14
last replythread treating stack ISA as CISC architecure new - A-11 - 2016-04-17
replythread treating stack ISA as CISC architecure new - Hubert Lamontagne - 2016-04-17
last replythread stack ISA versus long vectors new - Agner - 2016-04-18
last replythread stack ISA versus long vectors new - Hubert Lamontagne - 2016-04-19
last reply stack ISA versus long vectors new - Agner - 2016-04-20
last reply treating stack ISA as CISC architecure new - A-11 - 2016-04-18
replythread Proposal for an ideal extensible instruction set new - zboson - 2016-04-11
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-04-11
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-04-11
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-04-12
last reply Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-04-12
replythread Version 1.01 new - Agner - 2016-05-10
last replythread Version 1.01 new - Hubert Lamontagne - 2016-05-13
last replythread Version 1.01 new - Agner - 2016-05-14
last replythread Version 1.01 new - Harry - 2016-06-02
replythread Public repository new - Agner - 2016-06-02
reply Public repository new - Harry - 2016-06-02
last reply Public repository new - Harry - 2016-06-02
last reply Public repository new - Agner - 2016-06-09
replythread Rethinking DLLs and shared objects new - Agner - 2016-05-20
replythread Rethinking DLLs and shared objects new - cv - 2016-05-20
last reply Rethinking DLLs and shared objects new - Agner - 2016-05-20
replythread Rethinking DLLs and shared objects new - Peter Cordes - 2016-05-30
last replythread Rethinking DLLs and shared objects new - Agner - 2016-05-30
last replythread Rethinking DLLs and shared objects new - Joe Duarte - 2016-06-17
last replythread Rethinking DLLs and shared objects new - Agner - 2016-06-18
last reply Rethinking DLLs and shared objects new - Bigos - 2016-06-18
last replythread Rethinking DLLs and shared objects new - Freddie Witherden - 2016-06-02
last replythread Rethinking DLLs and shared objects new - Agner - 2016-06-04
last replythread Rethinking DLLs and shared objects new - Freddie Witherden - 2016-06-04
last reply Rethinking DLLs and shared objects new - Agner - 2016-06-06
replythread Is it better to have two stacks? new - Agner - 2016-06-05
reply Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-07
replythread Is it better to have two stacks? new - Eden Segal - 2016-06-13
last replythread Is it better to have two stacks? new - Agner - 2016-06-13
last replythread Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-14
last replythread Is it better to have two stacks? new - Agner - 2016-06-14
last replythread Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-15
last replythread Is it better to have two stacks? new - Agner - 2016-06-15
last replythread Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-16
last replythread Is it better to have two stacks? new - Agner - 2016-06-16
last reply Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-17
last reply Is it better to have two stacks? new - Freddie Witherden - 2016-06-22
last reply Now on Github new - Agner - 2016-06-26