Agner's CPU blog


thread Function library update, and more - Agner - 2011-08-04
reply Disassembler support for Knights Corner - Agner - 2012-08-24
last reply Function library update, and more - Steve - 2013-09-24
Function library update, and more
Author: Agner Date: 2011-08-04 05:00
My function library asmlib has been updated, and its functions have been made even faster. Some of the new features are:
  • Fast functions for string searching, using the very efficient SSE4.2 instructions
  • Fast functions for string parsing, searching for characters that belong to an arbitrary set, such as delimiters, whitespace, etc.
  • UTF-8 character count
  • Fast integer division when the same divisor is used multiple times
  • Fast integer vector division
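
To illustrate the idea behind fast division by a repeated divisor, here is a minimal sketch in C. It is not asmlib's implementation or API: the names `fastdiv_u32_t`, `fastdiv_u32_init` and `fastdiv_u32` are hypothetical, and the method shown is the generic "multiply by a precomputed reciprocal" technique for unsigned 32-bit operands. The reciprocal is computed once per divisor, and each later division becomes two multiplications and a few shifts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type (not part of asmlib): a divisor together
   with its precomputed 64-bit reciprocal. */
typedef struct {
    uint32_t d;  /* the divisor itself */
    uint64_t m;  /* ceil(2^64 / d); unused when d == 1 */
} fastdiv_u32_t;

/* Do the expensive work once per divisor. */
static fastdiv_u32_t fastdiv_u32_init(uint32_t d)
{
    fastdiv_u32_t f;
    assert(d != 0);            /* division by zero is not supported */
    f.d = d;
    f.m = UINT64_MAX / d + 1;  /* wraps to 0 for d == 1, handled below */
    return f;
}

/* Compute n / f->d without a division instruction. */
static uint32_t fastdiv_u32(uint32_t n, const fastdiv_u32_t *f)
{
    uint64_t m_hi, m_lo;

    if (f->d == 1) {
        return n;              /* the reciprocal does not fit in 64 bits */
    }
    m_hi = f->m >> 32;
    m_lo = f->m & 0xFFFFFFFFu;
    /* Bits 64 and up of the 96-bit product m * n, built from two
       64-bit products so that no 128-bit type is needed. */
    return (uint32_t)((m_hi * n + ((m_lo * n) >> 32)) >> 32);
}
```

A caller would run `fastdiv_u32_init` once outside a loop and call `fastdiv_u32` for every dividend inside it. The saving only pays off when the same divisor is reused many times, because the setup itself costs one ordinary division.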

Supports Windows, Linux, BSD, Mac, 32- and 64-bit.

My disassembler objconv has also been updated, now with support for the future AVX2 instruction set.

Disassembler support for Knights Corner
Author: Agner Date: 2012-08-24 05:07
The disassembler objconv has now been updated with support for the forthcoming Intel "Many Integrated Core" (MIC) coprocessor, code-named Knights Corner. See Knights Corner Instruction Set Reference.

This instruction set extends the size of vector registers from 128-bit xmm registers and 256-bit ymm registers to 512-bit zmm registers. The number of vector registers is extended to 32 registers named zmm0 - zmm31 in 64-bit mode. The number of vector registers in 32-bit mode has not been clarified, but it looks like only zmm0 - zmm7 are available in 32-bit mode for technical reasons. There is no extension to the general purpose registers. In addition to the primary function of each instruction, the vector instructions have many new attributes for masked operations, type conversion, broadcast, permutation, cache eviction hint, rounding mode, and suppression of exceptions.
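
As a rough illustration of what a masked operation on a 512-bit register means, here is a scalar model in C. It is only a sketch: the type `vec512_t` and the function `masked_add_u32` are hypothetical names, no real intrinsics or instructions are used, and it assumes that lanes whose mask bit is 0 keep the previous destination value. A 512-bit register viewed as 32-bit elements has 16 lanes, so one 16-bit mask covers the whole register.

```c
#include <stdint.h>

#define ZMM_LANES 16  /* 512 bits / 32 bits per lane */

/* Hypothetical scalar stand-in for one 512-bit vector register. */
typedef struct {
    uint32_t lane[ZMM_LANES];
} vec512_t;

/* Model of a masked add: lane i of the result is a + b when bit i of
   the mask is set; otherwise the old destination value is kept
   (assumed merge behaviour). */
static vec512_t masked_add_u32(vec512_t dst, uint16_t mask,
                               vec512_t a, vec512_t b)
{
    int i;
    for (i = 0; i < ZMM_LANES; i++) {
        if ((mask >> i) & 1u) {
            dst.lane[i] = a.lane[i] + b.lane[i];  /* wraps modulo 2^32 */
        }
    }
    return dst;
}
```

In hardware the selection happens in a single instruction on all lanes at once. The loop here only spells out the per-lane rule.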

The first application of this new instruction set will be on the new Intel MIC supercomputing platform, but there are rumors that this new instruction set (or at least the major parts of it) will make it into the mainline x86 ecosystem. The Knights Corner instruction set is carefully designed to be compatible with existing x86 and x64 code.

Function library update, and more
Author: Steve Date: 2013-09-24 15:39
Feature request please?



Excellent library BTW.