Search found 4 matches

by ...
2023-07-31, 10:24:05
Forum: Agner's CPU blog
Topic: Intel AVX10 & APX announcement
Replies: 7
Views: 209380

Re: Intel AVX10 & APX announcement

ARM SVE/2 can vary vector length between 128 bits and 2048 bits, in 128-bit increments. I don't see any reason to adopt this in x86 since you can just mask off the unused part of a vector when saving it. Might be worth pointing out that ARM has published an erratum (see C215) which restricts SVE vec...
by ...
2023-07-30, 10:32:37
Forum: Agner's CPU blog
Topic: Intel AVX10 & APX announcement
Replies: 7
Views: 209380

Re: Intel AVX10 & APX announcement

Responding to "What do you guys think?": Why would any other CPU vendor want to adopt this ISA extension? AFAIK AVX10.1 is just Sapphire Rapids-level AVX-512 renamed, with some new CPUID bits. I don't see why AMD (assuming they adopt FP16) would choose not to support it. AVX10.2 doesn't seem to cha...
by ...
2022-04-25, 6:48:11
Forum: Agner's CPU blog
Topic: Intel's new Chimera: Alder Lake
Replies: 14
Views: 696724

Re: Intel's new Chimera: Alder Lake

Thanks for the writeup.
If it helps, I have an AVX512-enabled 12700K if there's some program/code you want me to run on it.

If you want to get such a chip yourself, I wrote an article regarding requirements.
by ...
2021-10-04, 11:35:55
Forum: Agner's CPU blog
Topic: Intel Floating Point Executing 3 to 4 Times Faster Than it Should. MAKES NO SENSE
Replies: 4
Views: 88726

Re: Intel Floating Point Executing 3 to 4 Times Faster Than it Should. MAKES NO SENSE

Pre-loading the destination register of the multiply speeds up this loop by 5 times. Loading the source register does not speed it up. There's no way that loop should be able to run faster than the 5 clock cycle latency of the multiply instruction, and yet it does. This should be impossible. Even m...
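
For anyone wondering why a loop like that is expected to be bound by the multiply latency, here is a minimal C sketch of the dependency structure. It is not the loop from the post (that code isn't shown in this excerpt); the function names `chain_single` and `chain_four` are made up for illustration, and a compiler is free to reorder or vectorize this, so it only shows the data dependencies, not measured timings.

```c
#include <stddef.h>

/* One loop-carried chain: every multiply consumes the result of the
   previous one, so successive iterations cannot overlap. */
double chain_single(double x, double factor, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        x *= factor;            /* depends on x from the previous iteration */
    }
    return x;
}

/* Four independent chains: the multiplies inside one iteration do not
   depend on each other, so a pipelined multiplier can overlap them. */
void chain_four(double acc[4], double factor, size_t n)
{
    double a = acc[0], b = acc[1], c = acc[2], d = acc[3];
    for (size_t i = 0; i < n; i++) {
        a *= factor;            /* chain 0 */
        b *= factor;            /* chain 1 */
        c *= factor;            /* chain 2 */
        d *= factor;            /* chain 3 */
    }
    acc[0] = a; acc[1] = b; acc[2] = c; acc[3] = d;
}
```

Assuming the 5-cycle multiply latency mentioned above, the first version would normally be expected to need about 5 cycles per iteration, while the second can in principle keep several multiplies in flight at once. Whether that has anything to do with the speedup reported in the post is not something this sketch can show.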