
by agner
2024-10-31, 15:32:35
Forum: Agner's CPU blog
Topic: Testp and github
Replies: 1
Views: 782

Re: Testp and github

Good idea. I will put testp on github when I get the time. I don't have the time and resources to test every new microprocessor family on the market, so it will be good if other people can help.

Sorry for the bug. I have removed the extra case label.
by agner
2023-12-25, 8:34:40
Forum: Agner's CPU blog
Topic: Is using BSF instruction instead of using GNU C __builtin_ctz inefficient?
Replies: 1
Views: 29251

Re: Is using BSF instruction instead of using GNU C __builtin_ctz inefficient?

__builtin_ctz is not portable to all compilers. I don't think there is any difference in performance. Let's keep this discussion on stackoverflow. Remember to use the tag "vector-class-library" on stackoverflow.
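To illustrate the portability point, here is a minimal sketch of a count-trailing-zeros wrapper that uses __builtin_ctz where it exists and falls back elsewhere. The name ctz32 is hypothetical and not taken from the vector class library; this is one possible approach, not the library's method.

```c
#include <assert.h>
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical helper: index of the lowest set bit of x. x must be nonzero. */
static inline unsigned ctz32(uint32_t x) {
    assert(x != 0); /* the result is undefined for 0 with both BSF and __builtin_ctz */
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_ctz(x);      /* GCC and Clang builtin */
#elif defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, x);             /* MSVC intrinsic */
    return (unsigned)index;
#else
    unsigned n = 0;                         /* plain C fallback for other compilers */
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
#endif
}

/* Small usage example. */
void ctz32_demo(void) {
    assert(ctz32(1u) == 0);
    assert(ctz32(40u) == 3); /* 40 is 101000 in binary */
}
```

Whether the builtin or the instruction is faster on a given CPU would have to be measured; the sketch only addresses portability.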
by agner
2023-09-06, 10:06:59
Forum: Agner's CPU blog
Topic: Intel's "cripple AMD" function
Replies: 6
Views: 362371

Re: Intel's "cripple AMD" function

Karalinda wrote:
"In many cases, there are no good alternatives to Intel's function libraries"

Apparently, it is now possible to use the Intel function libraries without the cripple feature. See my previous post "New Clang-based Intel compiler is better".
by agner
2023-08-27, 5:39:08
Forum: Agner's CPU blog
Topic: Suggestion: Stop using "vector" for computer science
Replies: 1
Views: 34547

Re: Suggestion: Stop using "vector" for computer science

Language evolves. I am not sure this is the right forum to discuss this.
by agner
2023-08-25, 6:22:49
Forum: Agner's CPU blog
Topic: Testp Question
Replies: 1
Views: 33654

Re: Testp Question

An optimizing assembler should code mov rax,123 as mov eax,123 because the result is zero-extended into rax anyway. The two instructions should give identical results. Test results may vary for random reasons. Zero extension cannot be used with negative constants. mov rax,-123 is two bytes longer th...
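The zero-extension argument can be checked with a small model in C. This is only a sketch: the function names are hypothetical, and the functions model the register results rather than emit any instructions. A write to a 32-bit register clears the upper half of the 64-bit register, while the 64-bit form with a 32-bit immediate sign-extends it.

```c
#include <assert.h>
#include <stdint.h>

/* Models mov r32, imm32: the upper 32 bits of the 64-bit register become zero. */
static uint64_t mov_r32_imm32(uint32_t imm) {
    return (uint64_t)imm;
}

/* Models mov r64, imm32: the immediate is sign-extended to 64 bits. */
static uint64_t mov_r64_simm32(int32_t imm) {
    return (uint64_t)(int64_t)imm;
}

/* Returns 1 if the 32-bit form leaves the same value in the 64-bit register. */
static int can_use_32bit_form(int64_t value) {
    return value >= 0 && value <= (int64_t)UINT32_MAX;
}

/* Small usage example. */
void mov_demo(void) {
    /* Positive constant: both forms give the same 64-bit result. */
    assert(mov_r32_imm32(123u) == mov_r64_simm32(123));
    /* Negative constant: the results differ in the upper 32 bits. */
    assert(mov_r32_imm32((uint32_t)-123) != mov_r64_simm32(-123));
    assert(can_use_32bit_form(123));
    assert(!can_use_32bit_form(-123));
}
```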
by agner
2023-07-31, 13:14:35
Forum: Agner's CPU blog
Topic: Intel AVX10 & APX announcement
Replies: 7
Views: 210113

Re: Intel AVX10 & APX announcement

APX, on the other hand, does add decoder complexity. X86 until AVX512 already has 15 - 18 different prefixes, depending on how you count. APX adds just one more prefix (REX2) and extends the number of uses of an existing one (EVEX). This is just an incremental increase in complexity. It should be p...
by agner
2023-07-30, 11:54:04
Forum: Agner's CPU blog
Topic: Intel AVX10 & APX announcement
Replies: 7
Views: 210113

Re: Intel AVX10 & APX announcement

We have no promise that in 10 years 1024-bit AVX1024 vector won't crush on 512-FPU. And so we'll have to reinvent the wheel for another time, again.
The EVEX prefix used by AVX512 and AVX10 has space for extensions to 1024 bit vectors, but not 2048.
SVE/2 might support scaling vector width
ARM SVE/...
by agner
2023-07-30, 6:41:13
Forum: Agner's CPU blog
Topic: Intel AVX10 & APX announcement
Replies: 7
Views: 210113

Re: Intel AVX10 & APX announcement

Thanks for the links. As far as I can see from the manuals, the future AVX10.2 processors will be binary compatible with existing AVX512 code. You only have to recompile the code if you want to use the extra registers and new instructions. The advantages of the new features are limited, so I don't e...
by agner
2023-07-20, 17:17:54
Forum: Agner's CPU blog
Topic: AMD processors do allow you to change the CPUID string
Replies: 1
Views: 38378

Re: AMD processors do allow you to change the CPUID string

Sorry, you can change the CPU name string, but not the "vendor string" that says AuthenticAMD. It is the vendor string that is checked by Intel software.
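For reference, the vendor string is returned by CPUID leaf 0 as twelve ASCII bytes in the registers EBX, EDX, ECX, in that order. The sketch below assembles the string from the three register values and compares it with AuthenticAMD. The function names are hypothetical, and the register values are passed in as parameters so that the example does not depend on any particular compiler intrinsic. It does not show how Intel software performs its check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stores the four bytes of v in little-endian order, as CPUID packs the text. */
static void put_le32(char *dst, uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        dst[i] = (char)((v >> (8 * i)) & 0xFF);
    }
}

/* Hypothetical helper: builds the 12-character vendor string from leaf 0 output. */
static void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13]) {
    put_le32(out, ebx);      /* characters 0..3  */
    put_le32(out + 4, edx);  /* characters 4..7  */
    put_le32(out + 8, ecx);  /* characters 8..11 */
    out[12] = '\0';
}

/* Returns 1 if the registers spell AuthenticAMD, otherwise 0. */
static int is_authentic_amd(uint32_t ebx, uint32_t edx, uint32_t ecx) {
    char vendor[13];
    cpuid_vendor_string(ebx, edx, ecx, vendor);
    return strcmp(vendor, "AuthenticAMD") == 0;
}

/* Small usage example with the register values for the two common vendors. */
void vendor_demo(void) {
    assert(is_authentic_amd(0x68747541u, 0x69746E65u, 0x444D4163u));  /* AuthenticAMD */
    assert(!is_authentic_amd(0x756E6547u, 0x49656E69u, 0x6C65746Eu)); /* GenuineIntel */
}
```

The CPU name string mentioned above comes from different CPUID leaves (0x80000002 to 0x80000004) and is not touched by this check.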